A problem when reading files with ZipFile, and how to fix it (KeyError: "There is no item named 'xxx' in the archive")
Problem description
While running a piece of code on Windows, I ran into the following error:
KeyError: "There is no item named 'CDR_Data\\\\CDR.Corpus.v010516\\\\CDR_DevelopmentSet.PubTator.txt' in the archive"
Cause analysis
This is Python code that uses the zipfile library. It first creates a ZipFile object, and the error is raised when read() is called on it.
The relevant code is:
def download_zip(url: str) -> ZipFile:
    r = requests.get(url)
    z = ZipFile(io.BytesIO(r.content))
    return z

def _download_corpus() -> Tuple[str, str, str]:
    z = util.download_zip(CDR_URL)
    train = z.read(str(Path(PARENT_DIR) / TRAIN_FILENAME)).decode()
    valid = z.read(str(Path(PARENT_DIR) / VALID_FILENAME)).decode()
    test = z.read(str(Path(PARENT_DIR) / TEST_FILENAME)).decode()
    return train, valid, test
In the code above, the file path is assembled with Path(). On Windows the resulting path is separated by "\", for example:
CDR_Data\CDR.Corpus.v010516\CDR_TrainingSet.PubTator.txt
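This is what breaks on Windows. Member names inside a ZIP archive use "/" as the separator on every platform, and read() looks the name up as an exact string. A name built with "\" therefore matches no member, and read() raises KeyError. On Linux and macOS, Path() joins with "/", so the same code works there.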
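The mismatch can be reproduced on any platform with a small in-memory archive. The following is a minimal sketch: the member name data/sub/file.txt is made up for illustration, and PureWindowsPath stands in for what Path() produces on Windows.

```python
import io
from pathlib import PureWindowsPath
from zipfile import ZipFile

# Build a tiny in-memory archive with one nested member (hypothetical name).
buf = io.BytesIO()
with ZipFile(buf, "w") as zf:
    zf.writestr("data/sub/file.txt", "hello")

archive = ZipFile(buf)
print(archive.namelist())  # member names are stored with forward slashes

# PureWindowsPath mimics the string that str(Path(...)) yields on Windows.
windows_style = str(PureWindowsPath("data") / "sub" / "file.txt")

try:
    archive.read(windows_style)
    lookup_failed = False
except KeyError as exc:
    # Same kind of error as in the post: no member has a backslash name.
    lookup_failed = True
    print(exc)

# The forward-slash name matches the stored member and can be read.
content = archive.read("data/sub/file.txt").decode()
```

Printing namelist() on the real archive is a quick way to see exactly which names read() will accept.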
Solution
Stop using Path() to join the path and write the member names with "/" directly, for example:
def _download_corpus() -> Tuple[str, str, str]:
    z = util.download_zip(CDR_URL)
    train = z.read('CDR_Data/CDR.Corpus.v010516/CDR_TrainingSet.PubTator.txt').decode()
    valid = z.read('CDR_Data/CDR.Corpus.v010516/CDR_DevelopmentSet.PubTator.txt').decode()
    test = z.read('CDR_Data/CDR.Corpus.v010516/CDR_TestSet.PubTator.txt').decode()
    return train, valid, test
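If you would rather keep the directory and file name as separate constants, one possible alternative is to join them with PurePosixPath, which always uses "/" regardless of the host OS. This is only a sketch: the values assigned to PARENT_DIR and TRAIN_FILENAME below are assumptions inferred from the paths in the post, and member_name is a hypothetical helper, not part of the original code.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Assumed values; the original constants are not shown in the post.
PARENT_DIR = "CDR_Data/CDR.Corpus.v010516"
TRAIN_FILENAME = "CDR_TrainingSet.PubTator.txt"


def member_name(parent: str, filename: str) -> str:
    """Join archive path components with '/', independent of the host OS."""
    return str(PurePosixPath(parent) / filename)


train_member = member_name(PARENT_DIR, TRAIN_FILENAME)

# as_posix() converts an already-built Windows-style path to forward slashes.
converted = PureWindowsPath(
    "CDR_Data\\CDR.Corpus.v010516\\CDR_TrainingSet.PubTator.txt"
).as_posix()
```

Either string can then be passed to z.read(). Calling as_posix() on an existing Path object is the smallest change to the original code, since only str(...) has to be replaced.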
