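Kongfz (孔夫子舊書網) is China's main marketplace for used books and a core source of data on antiquarian books. Its open API serves metadata for ancient books, rare editions, and used books. Unlike a general e-commerce API, its data model includes fields specific to antiquarian books, such as era, condition, and edition, and it enforces strict requirements on call frequency and signature compliance. This article walks through the full call flow from a practical angle and covers four core scenarios: authentication and signing, ancient-book search, shop integration, and handling of the specialized data. It provides reusable Python code and a list of pitfalls, to help ancient-book digitization, academic research, and used-book dealer management projects get running quickly.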
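I. Preparing to Call the API (Focus on Ancient-Book Features)
1. Core Parameters and API Conventions (Essentials)
Before calling the Kongfz API, be clear on the basic parameters and platform limits, and pay particular attention to how the ancient-book fields must be supplied: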
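Category | Parameter | Description | Required |
Authentication | appKey | Unique application identifier assigned by the platform (obtained from the open platform after registration) | Yes |
Authentication | appSecret | Signing secret (keep it safe; store it in an environment variable rather than hard-coding it) | Yes |
Authentication | timestamp | Millisecond timestamp (must be within 5 minutes of the platform server time, otherwise the signature is rejected) | Yes |
Authentication | signature | Signature string generated by the platform's rules (MD5 digest, 32 lowercase hex characters) | Yes |
Ancient-book search | era | Era (e.g. "清代", "民國", "明代"; general e-commerce APIs have no such field) | No |
Ancient-book search | bookCondition | Condition code (1 = 全新, brand new, through 8 = 八五品以下; the key filter for ancient-book work) | No |
Ancient-book search | categoryId | Ancient-book category ID (e.g. "經部", "史部"; obtain it from the /v2/categories endpoint) | No |
General control | page/pageSize | Pagination (pageSize is capped at 20; larger values are truncated) | No |
General control | sort | Sort order (supports price_asc/publish_time_desc; "era descending" is common for ancient books) | No |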
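The limits in the table can be enforced on the client before a request is ever signed. The following is a minimal sketch, not part of the platform SDK: the helper name `build_search_params` and the choice to raise `ValueError` are assumptions made for illustration; only the pageSize cap of 20 and the bookCondition range 1 to 8 come from the table above.

```python
from typing import Any, Dict

MAX_PAGE_SIZE = 20              # pageSize cap stated in the parameter table
CONDITION_CODES = range(1, 9)   # bookCondition codes 1..8


def build_search_params(**kwargs: Any) -> Dict[str, Any]:
    """Hypothetical helper: drop empty values, clamp paging, validate bookCondition."""
    # Empty parameters are removed, mirroring what the signing rules expect
    params = {k: v for k, v in kwargs.items() if v is not None and v != ""}
    if "pageSize" in params:
        # Clamp locally instead of relying on the server to truncate
        params["pageSize"] = max(1, min(int(params["pageSize"]), MAX_PAGE_SIZE))
    if "page" in params:
        params["page"] = max(1, int(params["page"]))
    if "bookCondition" in params and int(params["bookCondition"]) not in CONDITION_CODES:
        raise ValueError(f"bookCondition must be 1-8, got {params['bookCondition']}")
    return params


if __name__ == "__main__":
    print(build_search_params(keyword="論語", era="清代", bookCondition=3,
                              pageSize=50, author=""))
    # {'keyword': '論語', 'era': '清代', 'bookCondition': 3, 'pageSize': 20}
```

Clamping on the client keeps the signed parameters identical to what the server will actually use, which makes paging results easier to reason about.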
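2. Core Endpoints (Grouped by Business Scenario)
Endpoint | Path | Function | Typical use |
Book search | /v2/books/search | Search ancient and used books by keyword, author, era, and condition | Bulk filtering of ancient books, academic sample collection |
Book detail | /v2/books/detail | Get detailed metadata for a single book (edition, page count, description) | Detail display, digital archiving |
Shop search | /v2/shops/search | Search used-book shops by region and main category | Shop partnership screening, multi-shop price comparison |
Shop detail | /v2/shops/detail | Get shop information plus its list of items on sale (item count is configurable) | Shop data integration, supply monitoring |
Category list | /v2/categories | Get the book category tree (including ancient-book categories) | Building category filters, category management |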
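3. Signature Generation Rules (Where Most Mistakes Happen)
Kongfz signs requests by sorting the parameters and hashing them with MD5. A mistake at any step returns a 401 authentication failure. The steps are:
- Filter parameters: remove empty-valued parameters and signature itself; keep the non-empty business and common parameters;
- ASCII sort: sort by parameter name in ascending ASCII order (e.g. appKey comes before bookCondition, era before timestamp);
- Concatenate: join the sorted parameters as key=value&key=value (e.g. appKey=xxx&era=清代&timestamp=1719000000000);
- Append the secret: append appSecret directly to the end of the concatenated string, with no separator (e.g. the string above + abc123def);
- MD5: encode the final string as UTF-8 and take its MD5 digest; the resulting 32-character lowercase hex string is the signature.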
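To make the five steps concrete, here is a small worked sketch that exposes the intermediate string before hashing, which is usually the fastest way to debug a 401. The function names `string_to_sign` and `sign` are illustrative only, and the parameter values are the placeholder values from the steps above, not real credentials.

```python
import hashlib


def string_to_sign(params: dict, app_secret: str) -> str:
    """Steps 1-4: filter, sort by name, join as key=value pairs, append the secret."""
    kept = {k: v for k, v in params.items()
            if v is not None and v != "" and k != "signature"}
    # sorted() on str keys compares code points, which is ASCII order for ASCII names
    joined = "&".join(f"{k}={kept[k]}" for k in sorted(kept))
    return joined + app_secret


def sign(params: dict, app_secret: str) -> str:
    """Step 5: MD5 over the UTF-8 bytes; hexdigest() is already lowercase."""
    return hashlib.md5(string_to_sign(params, app_secret).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    demo = {"timestamp": 1719000000000, "era": "清代", "appKey": "xxx", "keyword": ""}
    print(string_to_sign(demo, "abc123def"))
    # appKey=xxx&era=清代&timestamp=1719000000000abc123def
    print(len(sign(demo, "abc123def")))  # 32
```

Comparing the printed string against what the server-side rules would produce narrows a signature failure down to filtering, ordering, or the secret itself.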
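II. Core Implementation (Highlighting Ancient-Book Handling)
1. General Authentication Utility (Works for All Endpoints)
This class wraps signature generation and timestamp retrieval so that every endpoint can reuse them: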
    import hashlib
    import time
    from typing import Optional


    class KongfzAuthUtil:
        """Authentication helper for the Kongfz open platform (signs requests for all endpoints)."""

        @staticmethod
        def generate_sign(params: dict, app_secret: str) -> Optional[str]:
            """
            Generate the signature following the Kongfz MD5 signing rules.
            :param params: parameters to sign (common and business parameters)
            :param app_secret: application secret
            :return: 32-character lowercase signature, or None on failure
            """
            try:
                # 1. Drop empty values and the signature field itself
                valid_params = {
                    k: v for k, v in params.items()
                    if v is not None and v != "" and k != "signature"
                }
                # 2. Sort by parameter name in ascending ASCII order
                sorted_params = sorted(valid_params.items(), key=lambda x: x[0])
                # 3. Join as "key=value&key=value" using raw values; urlencode() would
                #    percent-encode values such as 清代 and no longer match the signing rules
                param_str = "&".join(f"{k}={v}" for k, v in sorted_params)
                # 4. Append appSecret and take the MD5 digest (hexdigest() is lowercase)
                sign_str = f"{param_str}{app_secret}"
                return hashlib.md5(sign_str.encode("utf-8")).hexdigest()
            except Exception as e:
                print(f"Signature generation failed: {e}")
                return None

        @staticmethod
        def get_timestamp() -> int:
            """Return the current timestamp in milliseconds (keep the host clock in sync with the platform)."""
            return int(time.time() * 1000)
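2. Ancient-Book Client (Parsing the Specialized Fields)
The client parses the era, condition, and edition fields specifically and returns structured data, which reduces the work left to the business layer: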
    import time
    from datetime import datetime
    from threading import Lock
    from typing import Dict, Optional

    import requests


    class KongfzBookClient:
        """Kongfz ancient-book API client (parses the ancient-book fields)."""

        def __init__(self, app_key: str, app_secret: str):
            self.app_key = app_key
            self.app_secret = app_secret
            self.base_url = "https://open.kongfz.com/api"
            self.timeout = 15  # seconds; ancient-book payloads can be large, 15-20 s is suggested
            self.qps_limit = 3  # platform QPS limit (max 3 per app; exceeding it returns 429)
            self.last_request_time = 0.0
            self.request_lock = Lock()  # serializes QPS control across threads

        def _get_common_params(self) -> Dict:
            """Build the common parameters shared by all endpoints."""
            return {
                "appKey": self.app_key,
                "timestamp": KongfzAuthUtil.get_timestamp(),
                "format": "json",  # always request JSON
            }

        def _control_qps(self) -> None:
            """Throttle requests to stay under the QPS limit and avoid 429 errors."""
            with self.request_lock:
                min_interval = 1.0 / self.qps_limit  # minimum gap between requests (about 0.33 s)
                elapsed = time.time() - self.last_request_time
                if elapsed < min_interval:
                    time.sleep(min_interval - elapsed)  # wait out the remainder
                # Record the time after any sleep, so the next gap is measured correctly
                self.last_request_time = time.time()

        def search_ancient_books(self, **kwargs) -> Optional[Dict]:
            """
            Ancient-book search (filters by era, condition, and category).
            :param kwargs: search parameters (era, bookCondition, keyword, ...)
            :return: structured result (pagination info plus book list)
            """
            self._control_qps()
            # 1. Build the request URL and parameters
            url = f"{self.base_url}/v2/books/search"
            params = self._get_common_params()
            # Keep only the business parameters the endpoint supports
            valid_params = ["keyword", "author", "era", "bookCondition", "categoryId",
                            "minPrice", "maxPrice", "page", "pageSize", "sort"]
            for param in valid_params:
                if param in kwargs and kwargs[param] is not None:
                    params[param] = kwargs[param]
            # 2. Sign the request
            params["signature"] = KongfzAuthUtil.generate_sign(params, self.app_secret)
            # 3. Send the request and handle the response
            try:
                response = requests.post(
                    url,
                    json=params,
                    headers={
                        "Content-Type": "application/json;charset=utf-8",
                        "User-Agent": "KongfzAncientBookClient/1.0",
                    },
                    timeout=self.timeout,
                )
                response.raise_for_status()  # raise on 4xx/5xx
                result = response.json()
                # 4. Business-level error check (code == 200 means success)
                if result.get("code") != 200:
                    raise Exception(
                        f"Search failed: {result.get('message', 'unknown error')} "
                        f"(code: {result.get('code')})"
                    )
                # 5. Parse the ancient-book data (specialized fields)
                return self._parse_ancient_book_result(result.get("data", {}))
            except Exception as e:
                print(f"Ancient-book search error: {e}")
                return None

        def get_book_detail(self, book_id: str) -> Optional[Dict]:
            """Get the details of one book (edition, content description, shop info)."""
            self._control_qps()
            url = f"{self.base_url}/v2/books/detail"
            params = self._get_common_params()
            params["id"] = book_id
            params["signature"] = KongfzAuthUtil.generate_sign(params, self.app_secret)
            try:
                response = requests.post(
                    url,
                    json=params,
                    headers={"Content-Type": "application/json;charset=utf-8"},
                    timeout=self.timeout,
                )
                response.raise_for_status()
                result = response.json()
                if result.get("code") != 200:
                    raise Exception(
                        f"Detail request failed: {result.get('message')} "
                        f"(code: {result.get('code')})"
                    )
                return self._parse_book_detail(result.get("data", {}))
            except Exception as e:
                print(f"Book detail error (ID: {book_id}): {e}")
                return None

        def _parse_ancient_book_result(self, raw_data: Dict) -> Optional[Dict]:
            """Parse search results (era and condition fields in particular)."""
            if not raw_data or "items" not in raw_data:
                return None
            # Pagination info
            search_info = {
                "total": raw_data.get("total", 0),
                "page": raw_data.get("page", 1),
                "page_size": raw_data.get("pageSize", 20),
                "total_page": raw_data.get("totalPage", 0),
            }
            # Individual books
            ancient_books = []
            for item in raw_data["items"]:
                # Map the condition code to text when the API gives none (e.g. 2 -> 九五品)
                book_condition_desc = (
                    item.get("bookConditionDesc", "")
                    or self._map_condition_code(item.get("bookCondition", 0))
                )
                ancient_books.append({
                    "book_id": item.get("id", ""),
                    "title": item.get("title", ""),
                    "author": item.get("author", ""),
                    "era": item.get("era", "unknown era"),  # core ancient-book field
                    "book_condition": {
                        "code": item.get("bookCondition", 0),
                        "desc": book_condition_desc,
                    },
                    "price": float(item.get("price", 0)),
                    "publisher": item.get("publisher", "unknown publisher"),
                    "publish_time": item.get("publishTime", "unknown date"),
                    "cover_img": self._complete_img_url(item.get("coverImg", "")),
                    "shop_info": {
                        "id": item.get("shopId", ""),
                        "name": item.get("shopName", ""),
                    },
                    "tags": item.get("tags", []),  # e.g. ["儒家經典", "清代刻本"]
                    "fetch_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
                })
            return {"search_info": search_info, "ancient_books": ancient_books}

        def _parse_book_detail(self, raw_data: Dict) -> Optional[Dict]:
            """Parse book details (edition, content description, and other deep fields)."""
            if not raw_data:
                return None
            return {
                "book_id": raw_data.get("id", ""),
                "title": raw_data.get("title", ""),
                "subtitle": raw_data.get("subtitle", ""),
                "author": raw_data.get("author", ""),
                "translator": raw_data.get("translator", ""),
                "era": raw_data.get("era", "unknown era"),
                "version": raw_data.get("edition", "unknown edition"),  # e.g. "清代刻本"
                "binding": raw_data.get("binding", "unknown binding"),  # thread-bound or paperback
                "pages": raw_data.get("pages", 0),  # ancient books often count in 卷; convert downstream
                "price": float(raw_data.get("price", 0)),
                "book_condition": {
                    "code": raw_data.get("bookCondition", 0),
                    "desc": raw_data.get("bookConditionDesc", "")
                    or self._map_condition_code(raw_data.get("bookCondition", 0)),
                },
                # Preservation state and defects
                "description": raw_data.get("description", "no detailed description"),
                # Content summary (useful for academic research)
                "content_desc": raw_data.get("contentDesc", "no content summary"),
                "images": [self._complete_img_url(img) for img in raw_data.get("images", [])],
                "shop_info": {
                    "id": raw_data.get("shopId", ""),
                    "name": raw_data.get("shopName", ""),
                    "score": float(raw_data.get("shopScore", 0)),
                    # Shop location (for regional supply analysis)
                    "location": raw_data.get("shopLocation", "unknown region"),
                },
                "fetch_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
            }

        def _map_condition_code(self, code: int) -> str:
            """Map a condition code to its text description."""
            condition_map = {
                1: "全新", 2: "九五品", 3: "九品", 4: "八五品",
                5: "八品", 6: "七品", 7: "六品", 8: "八五品以下",
            }
            return condition_map.get(code, "unknown condition")

        def _complete_img_url(self, url: str) -> str:
            """Complete an image URL (protocol-relative or relative path) to avoid 404s."""
            if not url:
                return ""
            if url.startswith(("http://", "https://")):
                return url
            if url.startswith("//"):
                return f"https:{url}"  # protocol-relative URL
            return f"https://img.kongfz.com{url}"
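3. Shop Data Client (Including Books on Sale)
For used-book dealer management and multi-shop price comparison, this client wraps the shop search and shop detail endpoints and can fetch a shop's list of ancient books on sale: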
    import time
    from datetime import datetime
    from threading import Lock
    from typing import Dict, Optional

    import requests


    class KongfzShopClient:
        """Kongfz shop API client (can fetch ancient books on sale)."""

        def __init__(self, app_key: str, app_secret: str):
            self.app_key = app_key
            self.app_secret = app_secret
            self.base_url = "https://open.kongfz.com/api"
            self.timeout = 15
            self.qps_limit = 3
            self.last_request_time = 0.0
            self.request_lock = Lock()

        def _get_common_params(self) -> Dict:
            return {
                "appKey": self.app_key,
                "timestamp": KongfzAuthUtil.get_timestamp(),
                "format": "json",
            }

        def _control_qps(self) -> None:
            with self.request_lock:
                min_interval = 1.0 / self.qps_limit
                elapsed = time.time() - self.last_request_time
                if elapsed < min_interval:
                    time.sleep(min_interval - elapsed)
                self.last_request_time = time.time()  # record the time after any sleep

        def search_shops(self, **kwargs) -> Optional[Dict]:
            """Search used-book shops (filters by region, main category, and score)."""
            self._control_qps()
            url = f"{self.base_url}/v2/shops/search"
            params = self._get_common_params()
            # Supported shop search parameters
            valid_params = ["keyword", "categoryId", "location", "minScore",
                            "minSales", "isVip", "page", "pageSize", "sort"]
            for param in valid_params:
                if param in kwargs and kwargs[param] is not None:
                    params[param] = kwargs[param]
            params["signature"] = KongfzAuthUtil.generate_sign(params, self.app_secret)
            try:
                response = requests.post(
                    url,
                    json=params,
                    headers={"Content-Type": "application/json;charset=utf-8"},
                    timeout=self.timeout,
                )
                response.raise_for_status()
                result = response.json()
                if result.get("code") != 200:
                    raise Exception(
                        f"Shop search failed: {result.get('message')} "
                        f"(code: {result.get('code')})"
                    )
                return self._parse_shop_search_result(result.get("data", {}))
            except Exception as e:
                print(f"Shop search error: {e}")
                return None

        def get_shop_detail(self, shop_id: str, goods_count: int = 5) -> Optional[Dict]:
            """Get shop details plus books on sale (5 by default, 20 at most)."""
            self._control_qps()
            url = f"{self.base_url}/v2/shops/detail"
            params = self._get_common_params()
            params["id"] = shop_id
            params["goodsCount"] = min(goods_count, 20)  # cap the number of items returned
            params["signature"] = KongfzAuthUtil.generate_sign(params, self.app_secret)
            try:
                response = requests.post(
                    url,
                    json=params,
                    headers={"Content-Type": "application/json;charset=utf-8"},
                    timeout=self.timeout,
                )
                response.raise_for_status()
                result = response.json()
                if result.get("code") != 200:
                    raise Exception(
                        f"Shop detail request failed: {result.get('message')} "
                        f"(code: {result.get('code')})"
                    )
                return self._parse_shop_detail(result.get("data", {}))
            except Exception as e:
                print(f"Shop detail error (ID: {shop_id}): {e}")
                return None

        def _parse_shop_search_result(self, raw_data: Dict) -> Optional[Dict]:
            """Parse shop search results."""
            if not raw_data or "items" not in raw_data:
                return None
            search_info = {
                "total": raw_data.get("total", 0),
                "page": raw_data.get("page", 1),
                "page_size": raw_data.get("pageSize", 20),
                "total_page": raw_data.get("totalPage", 0),
            }
            shops = []
            for item in raw_data["items"]:
                shops.append({
                    "shop_id": item.get("id", ""),
                    "name": item.get("name", ""),
                    "location": item.get("location", "unknown region"),  # e.g. "北京潘家園"
                    "score": float(item.get("score", 0)),  # shop score (for picking good sources)
                    "sales": item.get("sales", 0),  # total sales (a trust signal)
                    "goods_count": item.get("goodsCount", 0),  # items on sale
                    "is_vip": item.get("isVip", False),  # VIP shop (better service guarantees)
                    "main_category": item.get("mainCategory", "unknown category"),  # e.g. "古籍善本"
                    "fetch_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
                })
            return {"search_info": search_info, "shops": shops}

        def _parse_shop_detail(self, raw_data: Dict) -> Optional[Dict]:
            """Parse shop details (including the list of books on sale)."""
            if not raw_data:
                return None
            # Books on sale (core fields only)
            on_sale_books = []
            for p in raw_data.get("products", []):
                on_sale_books.append({
                    "book_id": p.get("id", ""),
                    "title": p.get("title", ""),
                    "price": float(p.get("price", 0)),
                    "book_condition": p.get("bookConditionDesc", "unknown condition"),
                    "cover_img": self._complete_img_url(p.get("coverImg", "")),
                    "publish_time": p.get("publishTime", "unknown date"),
                })
            return {
                "shop_id": raw_data.get("id", ""),
                "name": raw_data.get("name", ""),
                "logo": self._complete_img_url(raw_data.get("logo", "")),
                "location": raw_data.get("location", "unknown region"),
                "score": float(raw_data.get("score", 0)),
                # Score breakdown (service / shipping / description)
                "score_detail": raw_data.get("scoreDetail", {}),
                "sales": raw_data.get("sales", 0),
                "month_sales": raw_data.get("monthSales", 0),  # monthly sales (recent activity)
                "goods_count": raw_data.get("goodsCount", 0),
                # e.g. "主營清代古籍、民國期刊"
                "business_scope": raw_data.get("businessScope", "no business scope given"),
                "on_sale_books": on_sale_books,
                "fetch_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
            }

        def _complete_img_url(self, url: str) -> str:
            """Same image URL completion logic as the book client."""
            if not url:
                return ""
            if url.startswith(("http://", "https://")):
                return url
            if url.startswith("//"):
                return f"https:{url}"  # protocol-relative URL
            return f"https://img.kongfz.com{url}"
4. Data Manager (Caching + Batch Processing)
Ancient-book data changes slowly and tends to be requested repeatedly, so this class adds caching and batch processing to cut down on API calls:
    import hashlib
    import json
    import os
    import sqlite3
    import time
    from datetime import datetime, timedelta
    from typing import Dict, List, Optional


    class KongfzDataManager:
        """Kongfz data manager (caching, batch processing, expiry cleanup)."""

        # Primary-key column of each cache table. An explicit map avoids deriving the
        # column from the table name, which would yield a nonexistent "ancient_book_id".
        _KEY_COLUMNS = {
            "ancient_book_cache": "book_id",
            "search_cache": "cache_key",
            "shop_cache": "shop_id",
        }

        def __init__(self, app_key: str, app_secret: str, cache_dir: str = "./kongfz_cache"):
            self.book_client = KongfzBookClient(app_key, app_secret)
            self.shop_client = KongfzShopClient(app_key, app_secret)
            self.cache_dir = cache_dir
            self.db_path = os.path.join(cache_dir, "kongfz_cache.db")
            self._init_cache_db()  # create the cache database

        def _init_cache_db(self) -> None:
            """Create the book, search-result, and shop cache tables."""
            os.makedirs(self.cache_dir, exist_ok=True)
            conn = sqlite3.connect(self.db_path)
            cursor = conn.cursor()
            # Book cache (long TTL, since ancient-book data rarely changes)
            cursor.execute('''
                CREATE TABLE IF NOT EXISTS ancient_book_cache (
                    book_id TEXT PRIMARY KEY,
                    data TEXT,
                    fetch_time TEXT
                )
            ''')
            # Search-result cache (short TTL, to avoid stale data)
            cursor.execute('''
                CREATE TABLE IF NOT EXISTS search_cache (
                    cache_key TEXT PRIMARY KEY,
                    data TEXT,
                    fetch_time TEXT,
                    keyword TEXT
                )
            ''')
            # Shop cache
            cursor.execute('''
                CREATE TABLE IF NOT EXISTS shop_cache (
                    shop_id TEXT PRIMARY KEY,
                    data TEXT,
                    fetch_time TEXT
                )
            ''')
            conn.commit()
            conn.close()

        def batch_get_ancient_books(self, book_ids: List[str], cache_ttl: int = 86400) -> List[Dict]:
            """Fetch book details in bulk (cached, 1 day by default)."""
            books = []
            for book_id in book_ids:
                # Try the cache first
                cached = self._get_cached("ancient_book_cache", book_id, cache_ttl)
                if cached:
                    books.append(cached)
                    continue
                # Cache miss: call the API
                detail = self.book_client.get_book_detail(book_id)
                if detail:
                    books.append(detail)
                    self._update_cache("ancient_book_cache", book_id, detail)
                time.sleep(0.5)  # extra gap on top of QPS control
            return books

        def search_ancient_books_with_cache(self, keyword: str, cache_ttl: int = 3600,
                                            **kwargs) -> Optional[Dict]:
            """Ancient-book search with caching (results cached for 1 hour)."""
            # Unique cache key covering the keyword and all other parameters
            cache_key = self._generate_cache_key(keyword, **kwargs)
            cached = self._get_cached("search_cache", cache_key, cache_ttl)
            if cached:
                print(f"Cache hit: search (keyword: {keyword}, page: {kwargs.get('page', 1)})")
                return cached
            # Fetch from the API and update the cache
            result = self.book_client.search_ancient_books(keyword=keyword, **kwargs)
            if result:
                self._update_cache("search_cache", cache_key, result, keyword=keyword)
            return result

        def _generate_cache_key(self, keyword: str, **kwargs) -> str:
            """Build a unique cache key for a search (avoids duplicate cache entries)."""
            sorted_params = sorted(kwargs.items(), key=lambda x: x[0])
            params_str = "&".join(f"{k}={v}" for k, v in sorted_params)
            return hashlib.md5(f"ancient_search_{keyword}_{params_str}".encode()).hexdigest()

        def _get_cached(self, table: str, key: str, ttl: int) -> Optional[Dict]:
            """Read from a cache table, honoring the TTL."""
            id_col = self._KEY_COLUMNS[table]
            conn = sqlite3.connect(self.db_path)
            cursor = conn.cursor()
            cursor.execute(f"SELECT data, fetch_time FROM {table} WHERE {id_col} = ?", (key,))
            record = cursor.fetchone()
            conn.close()
            if not record:
                return None
            data_str, fetch_time = record
            # Expiry check
            fetch_dt = datetime.strptime(fetch_time, "%Y-%m-%d %H:%M:%S")
            if (datetime.now() - fetch_dt).total_seconds() > ttl:
                return None
            try:
                return json.loads(data_str)
            except json.JSONDecodeError:
                return None

        def _update_cache(self, table: str, key: str, data: Dict, keyword: str = "") -> None:
            """Insert or replace a cache row."""
            id_col = self._KEY_COLUMNS[table]
            conn = sqlite3.connect(self.db_path)
            cursor = conn.cursor()
            data_str = json.dumps(data, ensure_ascii=False)
            fetch_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
            if table == "search_cache":
                # The search cache also stores the keyword
                cursor.execute(f'''
                    INSERT OR REPLACE INTO {table} ({id_col}, data, fetch_time, keyword)
                    VALUES (?, ?, ?, ?)
                ''', (key, data_str, fetch_time, keyword))
            else:
                cursor.execute(f'''
                    INSERT OR REPLACE INTO {table} ({id_col}, data, fetch_time)
                    VALUES (?, ?, ?)
                ''', (key, data_str, fetch_time))
            conn.commit()
            conn.close()

        def clean_expired_cache(self, max_age: int = 86400 * 7) -> Dict:
            """Delete expired cache rows (keeps 7 days of data by default)."""
            conn = sqlite3.connect(self.db_path)
            cursor = conn.cursor()
            expire_time = (datetime.now() - timedelta(seconds=max_age)).strftime("%Y-%m-%d %H:%M:%S")
            deleted = {}
            # Book cache
            cursor.execute("DELETE FROM ancient_book_cache WHERE fetch_time < ?", (expire_time,))
            deleted["ancient_book"] = cursor.rowcount
            # Search cache
            cursor.execute("DELETE FROM search_cache WHERE fetch_time < ?", (expire_time,))
            deleted["search"] = cursor.rowcount
            # Shop cache
            cursor.execute("DELETE FROM shop_cache WHERE fetch_time < ?", (expire_time,))
            deleted["shop"] = cursor.rowcount
            conn.commit()
            conn.close()
            print(f"Cache cleanup done: {deleted['ancient_book']} book rows, "
                  f"{deleted['search']} search rows, {deleted['shop']} shop rows")
            return deleted
III. Worked Examples (Core Business Scenarios)
1. Bulk Search and Detail Retrieval (Academic Research)
    def ancient_book_research_demo():
        """Example: search Qing-dynasty books and fetch details (academic sample collection)."""
        # 1. Replace with your own appKey and appSecret (from the Kongfz open platform)
        APP_KEY = "your_app_key"
        APP_SECRET = "your_app_secret"

        # 2. Create the data manager (with caching)
        data_manager = KongfzDataManager(APP_KEY, APP_SECRET)

        # 3. Search parameters (Qing dynasty, 九品 or better, keyword "論語")
        search_params = {
            "keyword": "論語",
            "era": "清代",
            "bookCondition": 3,  # 3 = 九品 (academic work needs good condition)
            "minPrice": 100,
            "maxPrice": 5000,
            "page": 1,
            "pageSize": 10,
            "sort": "publish_time_desc",  # newest publication first (later printings come first)
        }

        # 4. Cached search (avoids repeat calls)
        print("=== Searching Qing-dynasty books ===")
        search_result = data_manager.search_ancient_books_with_cache(**search_params)
        if not search_result:
            print("Ancient-book search failed")
            return

        # 5. Print a summary of the results
        search_info = search_result["search_info"]
        print(f"Results: {search_info['total']} Qing-dynasty books related to 論語, "
              f"page {search_info['page']}/{search_info['total_page']}")

        # 6. Print each book
        for i, book in enumerate(search_result["ancient_books"], 1):
            print(f"\n{i}. Title: {book['title']}")
            print(f"   Author: {book['author']} | Era: {book['era']} | "
                  f"Condition: {book['book_condition']['desc']}")
            print(f"   Price: ¥{book['price']} | Shop: {book['shop_info']['name']}")
            print(f"   Cover: {book['cover_img'][:50]}...")

        # 7. Fetch details for the first 3 books (for academic analysis)
        if search_result["ancient_books"]:
            book_ids = [book["book_id"] for book in search_result["ancient_books"][:3]]
            print(f"\n=== Fetching details for {len(book_ids)} books ===")
            book_details = data_manager.batch_get_ancient_books(book_ids)
            for detail in book_details:
                print(f"\nTitle: {detail['title']}")
                print(f"Edition: {detail['version']} | Binding: {detail['binding']} | "
                      f"Pages: {detail['pages']}")
                print(f"Summary: {detail['content_desc'][:150]}...")  # first 150 characters
                print(f"Shop: {detail['shop_info']['name']} "
                      f"(score: {detail['shop_info']['score']})")

        # 8. Clean up expired cache rows (optional)
        data_manager.clean_expired_cache()


    if __name__ == "__main__":
        ancient_book_research_demo()
2. Shop Screening and On-Sale Book Analysis (Dealer Sourcing)
    def ancient_shop_analysis_demo():
        """Example: find high-quality ancient-book shops in Beijing and inspect their stock."""
        APP_KEY = "your_app_key"
        APP_SECRET = "your_app_secret"
        data_manager = KongfzDataManager(APP_KEY, APP_SECRET)

        # 1. VIP shops in Beijing dealing in ancient books, with a score of 4.5 or higher
        shop_search_params = {
            "keyword": "古籍",
            "location": "北京",
            "minScore": 4.5,
            "isVip": 1,  # 1 = VIP shop (better service guarantees)
            "minSales": 1000,  # total sales >= 1000 (active shops only)
            "page": 1,
            "pageSize": 5,
            "sort": "sales_desc",  # highest sales first
        }
        print("=== Screening ancient-book shops in Beijing ===")
        shop_result = data_manager.shop_client.search_shops(**shop_search_params)
        if not shop_result:
            print("Shop screening failed")
            return

        # 2. Print the shop list
        print(f"Results: {shop_result['search_info']['total']} matching shops")
        for i, shop in enumerate(shop_result["shops"], 1):
            print(f"\n{i}. Shop: {shop['name']}")
            print(f"   Region: {shop['location']} | Score: {shop['score']} | "
                  f"Sales: {shop['sales']} orders")
            print(f"   Books on sale: {shop['goods_count']} | "
                  f"VIP: {'yes' if shop['is_vip'] else 'no'}")

        # 3. Details and books on sale for the first shop
        if shop_result["shops"]:
            first_shop_id = shop_result["shops"][0]["shop_id"]
            print(f"\n=== Details for shop [{shop_result['shops'][0]['name']}] ===")
            shop_detail = data_manager.shop_client.get_shop_detail(first_shop_id, goods_count=8)
            if shop_detail:
                print(f"Business scope: {shop_detail['business_scope']}")
                print(f"Sales this month: {shop_detail['month_sales']} orders | "
                      f"Region: {shop_detail['location']}")
                print(f"\nBooks on sale ({len(shop_detail['on_sale_books'])}):")
                for i, book in enumerate(shop_detail["on_sale_books"], 1):
                    print(f"{i}. {book['title']} | Price: ¥{book['price']} | "
                          f"Condition: {book['book_condition']}")


    # Run the example
    # if __name__ == "__main__":
    #     ancient_shop_analysis_demo()
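IV. Pitfalls (Specific to the Kongfz API)
1. Common Errors and Fixes (with Error Codes)
Symptom | Code | Likely cause | Fix |
Signature failure | 401 | 1. Timestamp is off by more than 5 minutes; 2. Parameters are sorted incorrectly; 3. appSecret is wrong | 1. Sync the server clock; 2. Make sure parameters are sorted in ascending ASCII order; 3. Re-check appSecret |
Rate limit exceeded | 429 | QPS above 3, or more than 3000 calls in a day | 1. Space out calls with _control_qps; 2. Add caching (especially for book details); 3. Spread calls across time windows |
Era filter returns nothing | 200 | 1. The era value is in the wrong form (e.g. "清朝" should be "清代"); 2. No matching data | 1. Use the standard era values from the platform docs (清代 / 民國 / 明代); 2. Relax the condition or price limits |
Image URL returns 404 | - | The API returned a relative path (e.g. "/books/123.jpg") | Complete it to a full HTTPS URL with _complete_img_url |
Shop has no items on sale | 200 | goodsCount is over the limit (max 20), or the shop has nothing on sale | 1. Keep goodsCount ≤ 20; 2. Check whether the shop is operating normally (via the shop_status field) |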
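For the 429 row in particular, spacing calls out is the first defense, but a retry with growing delays helps when a limit is hit anyway. The sketch below is one possible approach, not something the platform prescribes: `RateLimited` and `call_with_backoff` are hypothetical names, and it assumes the caller raises `RateLimited` when a response carries status 429 (the clients in this article print the error and return None, so they would need a small change to use it).

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical exception the caller raises when the API answers 429."""


def call_with_backoff(func: Callable[[], T], max_retries: int = 3,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func(); on RateLimited, wait base_delay * 2**attempt and try again."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except RateLimited:
            if attempt == max_retries:
                raise  # retries exhausted: surface the error to the caller
            sleep(base_delay * (2 ** attempt))  # 1 s, 2 s, 4 s, ... with the defaults
    raise RuntimeError("unreachable")  # keeps type checkers satisfied


if __name__ == "__main__":
    attempts = {"n": 0}

    def flaky() -> str:
        # Simulates an endpoint that is rate limited on the first two calls
        attempts["n"] += 1
        if attempts["n"] < 3:
            raise RateLimited("429")
        return "ok"

    print(call_with_backoff(flaky, base_delay=0.01))  # ok
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.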
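2. Tips for Handling Ancient-Book Data (Where the Differentiation Lies)
- Era normalization: the API may return era as "清", "清代", or "清朝"; map these to one standard value (e.g. "清" → "清代") to keep the data consistent;
- Richer condition descriptions: show the bookCondition code together with its text (e.g. "3 - 九品") and add a business note (e.g. "九品: well preserved, slight wear, no missing pages");
- Multi-volume sets: ancient books are often sold as "全 X 冊" sets; extract the volume count from the description field (e.g. with the regex r'全(\d+)冊') and add it to the structured data;
- Edition distinction: pay close attention to the edition field (e.g. "乾隆刻本", "民國影印本"); for academic use, store edition information separately so samples can be classified by it.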
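The first three tips above can be sketched in a few lines. This is an illustration under stated assumptions: the alias table is a made-up starting point (only the "清"/"清朝" → "清代" mapping comes from the text; the 明 entries are extrapolated), the function names are hypothetical, and the pattern is widened slightly from the one quoted above to tolerate spaces and the Simplified form 册. Counts written in Chinese numerals (e.g. 全四冊) are not handled.

```python
import re
from typing import Optional

# Illustrative alias table; extend it with the values seen in real responses
ERA_ALIASES = {
    "清": "清代", "清朝": "清代", "清代": "清代",
    "明": "明代", "明朝": "明代", "明代": "明代",
}

# "全 X 冊" with Arabic digits; accepts optional spaces and Simplified 册
VOLUME_PATTERN = re.compile(r"全\s*(\d+)\s*[冊册]")


def normalize_era(raw: str) -> str:
    """Map known aliases to the standard era value; leave unknown values unchanged."""
    value = raw.strip()
    return ERA_ALIASES.get(value, value)


def extract_volume_count(description: str) -> Optional[int]:
    """Return the volume count from a '全 X 冊' phrase, or None if absent."""
    match = VOLUME_PATTERN.search(description)
    return int(match.group(1)) if match else None


def format_condition(code: int, desc: str) -> str:
    """Combine the condition code and text, e.g. '3 - 九品'."""
    return f"{code} - {desc}"


if __name__ == "__main__":
    print(normalize_era("清朝"))                       # 清代
    print(extract_volume_count("線裝全12冊,品相佳"))    # 12
    print(format_condition(3, "九品"))                 # 3 - 九品
```

Leaving unknown era values untouched, rather than mapping them to a default, makes gaps in the alias table visible in the stored data.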
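3. Performance Tips (Lowering Call Cost)
- Tiered caching: cache book details (which rarely change) for 24 hours, search results (which may change) for 1-6 hours, and shop information for 12 hours;
- Incremental updates: record each book's fetch_time, and on the next run refresh only the items whose publish_time or price has changed, to reduce repeat calls;
- Batch request control: when fetching book details in bulk, add an extra 0.5-second gap on top of QPS control to avoid tripping the platform's burst-traffic limit;
- Field selection: if only core fields are needed (e.g. title, era, price), add a fields parameter to the request (e.g. fields=title,era,price) to reduce the amount of data transferred.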
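The tiered-caching and incremental-update tips can be expressed as two small helpers. This is a rough sketch rather than part of the data manager above: the names `CACHE_TTL_SECONDS`, `is_fresh`, and `changed_book_ids` are hypothetical, the 1-hour search TTL is just one point in the 1-6 hour range, and the record keys (`book_id`, `price`, `publish_time`) follow the parsed search results used in this article.

```python
from typing import Dict, Iterable, List

# TTL per data type, in seconds (values taken from the tips above)
CACHE_TTL_SECONDS = {
    "book_detail": 24 * 3600,  # details rarely change
    "search": 1 * 3600,        # anywhere from 1 to 6 hours is suggested
    "shop": 12 * 3600,
}


def is_fresh(kind: str, age_seconds: float) -> bool:
    """True if a cached entry of this kind is still within its TTL."""
    return age_seconds <= CACHE_TTL_SECONDS[kind]


def changed_book_ids(cached: Dict[str, dict], latest: List[dict],
                     watch: Iterable[str] = ("price", "publish_time")) -> List[str]:
    """Return IDs that are new, or whose watched fields differ from the cached copy."""
    fields = tuple(watch)
    changed = []
    for item in latest:
        old = cached.get(item["book_id"])
        if old is None or any(old.get(f) != item.get(f) for f in fields):
            changed.append(item["book_id"])
    return changed


if __name__ == "__main__":
    cached = {"a": {"price": 100.0, "publish_time": "1890"}}
    latest = [
        {"book_id": "a", "price": 120.0, "publish_time": "1890"},  # price changed
        {"book_id": "b", "price": 50.0, "publish_time": "1920"},   # not cached yet
    ]
    print(changed_book_ids(cached, latest))  # ['a', 'b']
```

Only the IDs returned by `changed_book_ids` would then need a detail call, which is where most of the savings come from.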
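V. Compliance and Extension Suggestions
- Compliance:
  - Data use: ancient-book data may be used only for your own business (e.g. academic research, internal management); it must not be sold or used for unfair competition;
  - Rate limits: design your logic strictly around the limits of QPS = 3 and 3000 calls per day; exceeding them leads to a temporary account ban;
  - Copyright: book descriptions, images, and other content returned by the API must be credited as "Source: 孔夫子舊書網"; the source information must not be altered or removed.
- Extensions:
  - Digital archiving: combine the description and images returned by get_book_detail to build a digital archive of ancient books;
  - Price trend analysis: use cached historical prices to analyze how prices of ancient books (e.g. Qing-dynasty woodblock editions) move over time;
  - Multi-platform integration: connect Kongfz ancient-book data with other ancient-book platforms (e.g. the Zhonghua Book Company digital library) to supplement academic samples.

If you run into a specific problem during integration, such as era parsing, condition-field handling, or throttling bulk calls, describe the scenario in the comments (e.g. "searching for Ming-dynasty books returns nothing") and I will share a targeted solution. The core value of the Kongfz API lies in its ancient-book data, and handling those specialized fields well is what unlocks that value.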