- SpringerCsv2Bib
- GetAbstract
- BibFilesMerge
Run this command to install the dependencies:
- pip install -r requirements.txt
Convert a Springer CSV file to a BibTeX file
foo@bar:~$ python SpringerCsv2Bib.py -h
usage: SpringerCsv2Bib.py [-h] -c CSVFILENAME -b BIBFILENAME
optional arguments:
-h, --help show this help message and exit
-c CSVFILENAME, --csvFileName CSVFILENAME
CSV file name
-b BIBFILENAME, --bibFileName BIBFILENAME
BibText file name
foo@bar:~$ python SpringerCsv2Bib.py -c "Springer.csv" -b "Springer.bib"
File founded: Springer.csv
Processed: 590
Removid without author: 5
Total Final: 585
Saved file: Springer.bib
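To illustrate what such a conversion involves, here is a minimal sketch, not the actual code of SpringerCsv2Bib.py. The column names (Item Title, Publication Title, Item DOI, Authors, Publication Year, URL) are assumed from a typical SpringerLink CSV export, and the citation key scheme is hypothetical. Rows without an author are skipped, which mirrors the "without author" counter in the output above.

```python
import csv
import io
import re


def row_to_bibtex(row, index):
    """Build one BibTeX entry from a CSV row, or return None if it has no author."""
    authors = (row.get("Authors") or "").strip()
    if not authors:
        return None  # counted as "removed without author"
    year = (row.get("Publication Year") or "").strip()
    # Hypothetical key scheme: "springer" + year + "n" + running index.
    key = re.sub(r"\W", "", f"springer{year}n{index}")
    fields = {
        "author": authors,  # copied as-is; no name splitting is attempted
        "title": (row.get("Item Title") or "").strip(),
        "journal": (row.get("Publication Title") or "").strip(),
        "year": year,
        "doi": (row.get("Item DOI") or "").strip(),
        "url": (row.get("URL") or "").strip(),
    }
    # Emit only the fields that have a value.
    body = ",\n".join(f"  {k} = {{{v}}}" for k, v in fields.items() if v)
    return f"@article{{{key},\n{body}\n}}"


def convert(csv_text):
    """Return (entries, removed_count) for CSV text that has a header row."""
    entries, removed = [], 0
    for index, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=1):
        entry = row_to_bibtex(row, index)
        if entry is None:
            removed += 1
        else:
            entries.append(entry)
    return entries, removed
```

Joining the returned entries with blank lines and writing them to a file would give a .bib file that the other tools can read.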
This tool fetches abstracts from digital libraries. It is an unofficial, improvised (MacGyver-style) solution, not an official API.
Note 1: With ACM, use the limit parameter (-l), because ACM blocks you if you fetch many abstracts at the same time.
Note 2: In my case, I use a proxy, because I access the libraries from home through my university's proxy.
foo@bar:~$ python GetAbstract.py -h
usage: GetAbstract.py [-h] -d {springer,acm,ieee} -f BIBFILENAME [-p PROXY]
[-l LIMIT]
optional arguments:
-h, --help show this help message and exit
-d {springer,acm,ieee}, --database {springer,acm,ieee}
select database
-f BIBFILENAME, --bibFileName BIBFILENAME
Springer bibFile name
-p PROXY, --proxy PROXY
internet proxy, ex:
https://john:[email protected]:4001
-l LIMIT, --limit LIMIT
abstract load limit
foo@bar:~$ python GetAbstract.py -d acm -f "ACM.bib" -l 10 -p https://peter:[email protected]:4001
Had Abstract: 85
Url errors: 0
Loaded Abstract: 10
Total Entries: 585
Limit to process: 10
Processed: 95
Left: 490
or
foo@bar:~$ python GetAbstract.py -d springer -f "Springer.bib"
Had Abstract: 85
Url errors: 0
Loaded Abstract: 10
Total Entries: 585
Limit to process: 10
Processed: 95
Left: 490
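The counters printed above are related in a simple way: Processed = Had Abstract + Loaded Abstract, and Left = Total Entries - Processed. The sketch below shows one way a limit could be applied; it is an assumption, not the code of GetAbstract.py. The fetch callable is hypothetical and stands in for the per-library download, and counting failed downloads against the limit is a choice made here for illustration.

```python
def load_abstracts(entries, fetch, limit=None):
    """Fill in missing abstracts for at most `limit` entries.

    entries: list of dicts with a "url" and optionally an "abstract".
    fetch:   hypothetical callable that takes a URL and returns the
             abstract text, raising an exception on failure.
    """
    stats = {"had": 0, "loaded": 0, "errors": 0}
    for entry in entries:
        if entry.get("abstract"):
            stats["had"] += 1
            continue
        if limit is not None and stats["loaded"] + stats["errors"] >= limit:
            continue  # over the limit: leave this entry for a later run
        try:
            entry["abstract"] = fetch(entry["url"])
            stats["loaded"] += 1
        except Exception:
            stats["errors"] += 1
    stats["total"] = len(entries)
    stats["processed"] = stats["had"] + stats["loaded"]
    stats["left"] = stats["total"] - stats["processed"]
    return stats
```

Because entries that already have an abstract are skipped, running the tool again on the same file continues where the previous run stopped, until Left reaches 0.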
Merge BibTeX files and:
- remove duplicate entries
- in some cases merge information before removing duplicates
- remove entries that do not have:
- author or
- title or
- year or
- journal name or conference name
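The rules above can be sketched as follows. This is a simplified illustration, not the code of BibFilesMerge.py: entries are plain dicts, duplicates are detected by a normalized title (an assumption; the real tool may compare other fields such as the DOI), and "merging information" is taken to mean copying fields that the kept entry is missing.

```python
import re

REQUIRED = ("author", "title", "year")


def norm_title(title):
    """Lowercase the title and keep only letters and digits."""
    return re.sub(r"[^a-z0-9]", "", title.lower())


def is_complete(entry):
    """An entry needs author, title, year and a journal or conference name."""
    has_venue = bool(entry.get("journal") or entry.get("booktitle"))
    return all(entry.get(f) for f in REQUIRED) and has_venue


def merge_entries(entries):
    """Drop incomplete entries, then merge duplicates by normalized title."""
    merged = {}
    for entry in entries:
        if not is_complete(entry):
            continue
        key = norm_title(entry["title"])
        if key in merged:
            # Copy over fields the kept entry is missing (e.g. abstract, doi).
            for field, value in entry.items():
                if value and not merged[key].get(field):
                    merged[key][field] = value
        else:
            merged[key] = dict(entry)  # copy, so the input is not modified
    return list(merged.values())
```

The first occurrence of each title is kept, so the order of the input files decides which entry survives and which one only contributes missing fields.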
This tool has been tested with these digital library files:
- ACM Digital Library
- IEEE Xplore
- Scopus
- SpringerLink
- ScienceDirect - Elsevier
- Web of Science (thanks @dineiar for this)
foo@bar:~$ python BibFilesMerge.py -h
usage: BibFilesMerge.py [-h] -p FOLDERPATH [-f [FILELIST [FILELIST ...]]]
[-o FILENAMEOUT] [-e [EXCLUDE [EXCLUDE ...]]] [-l]
optional arguments:
-h, --help show this help message and exit
-p FOLDERPATH, --folderPath FOLDERPATH
Bib files folder path
-f [FILELIST [FILELIST ...]], --fileList [FILELIST [FILELIST ...]]
bib file name list, e.g. -f IEEE.bib ACM.bib
science.bib Springer.bib
-o FILENAMEOUT, --fileNameOut FILENAMEOUT
File name of merged file
-e [EXCLUDE [EXCLUDE ...]], --exclude [EXCLUDE [EXCLUDE ...]]
bib with entries to be removed from others, e.g. -e
FirstExecution.bib SecondExecution.bib
-l, --logProcess Log processing to CSV files
foo@bar:~$ python BibFilesMerge.py -p output/ -o 2019-2.bib -f 2019-2/ScienceDirect1.bib 2019-2/ScienceDirect2.bib 2019-2/Scopus.bib -e 2018/ACM.bib 2018/IEEE.bib 2018/ScienceDirect.bib 2018/SCOPUS.bib 2019/ACM.bib 2019/IEEE.bib 2019/ScienceDirect1.bib 2019/ScienceDirect2.bib 2019/SCOPUS.bib -l
--folderPath output/
--fileNameOut 2019-2.bib
--fileList ['2019-2/ScienceDirect1.bib', '2019-2/ScienceDirect2.bib', '2019-2/Scopus.bib']
--exclude ['2018/ACM.bib', '2018/IEEE.bib', '2018/ScienceDirect.bib', '2018/SCOPUS.bib', '2019/ACM.bib', '2019/IEEE.bib', '2019/ScienceDirect1.bib', '2019/ScienceDirect2.bib', '2019/SCOPUS.bib']
--logProcess True
2019-2/ScienceDirect1.bib
2019-2/ScienceDirect2.bib
2019-2/Scopus.bib
Total: 798
No Author: 0
No Year: 0
No Publisher: 0
Duplicates: 31
Merged: 25
Excluded from bib: 537
Final: 230
without Abstract: 0 {'2019-2/ScienceDirect1.bib': 0, '2019-2/ScienceDirect2.bib': 0, '2019-2/Scopus.bib': 0}
The two CSV files created in the output folder by the -l switch are:
- BibFilesMerge_removed.csv, with columns cause, source, key, doi, author, year, title and publish. cause is one of: no author, no year, no journal, duplicate of next or duplicate of prev
- BibFilesMerge_final.csv, with columns key, doi, author, year, title, publish and abstract
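As a quick way to inspect the log, the removed entries can be counted per cause. This sketch assumes the file is comma-separated and starts with a header row containing the column names listed above; adjust the delimiter if your file differs.

```python
import csv
from collections import Counter


def count_causes(lines):
    """Count removed entries per cause from an iterable of CSV lines."""
    reader = csv.DictReader(lines)
    return Counter(row["cause"] for row in reader)


# Example use (the path assumes "-p output/" was passed to BibFilesMerge.py):
#
#     with open("output/BibFilesMerge_removed.csv", newline="", encoding="utf-8") as f:
#         for cause, count in count_causes(f).most_common():
#             print(cause, count)
```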
Sometimes errors occur while reading a bib file. In that case, note the line number reported at the end of the error message, then open the bib file and fix that line. For example:
foo@bar:~$ python BibFilesMerge.py -p results -f IEEE.bib ACM.bib science.bib Springer.bib -o MyFile.bib
--folderPath results
--fileNameOut MyFile.bib
--fileList ['IEEE.bib', 'ACM.bib', 'science.bib', 'Springer.bib']
--exclude None
--logProcess False
IEEE.bib
ACM.bib
science.bib
Traceback (most recent call last):
File "BibFilesMerge.py", line 146, in <module>
run(args["folderPath"], args["fileList"], args["fileNameOut"])
File "BibFilesMerge.py", line 63, in run
bibData = parse_file(os.path.join(folderPath,bibFileName))
File "./pybtex\pybtex\database\__init__.py", line 865, in parse_file
return parser.parse_file(file)
File "./pybtex\pybtex\database\input\__init__.py", line 54, in parse_file
self.parse_stream(f)
File "./pybtex\pybtex\database\input\bibtex.py", line 410, in parse_stream
return self.parse_string(text)
File "./pybtex\pybtex\database\input\bibtex.py", line 397, in parse_string
for entry in entry_iterator:
File "./pybtex\pybtex\database\input\bibtex.py", line 191, in parse_bibliography
self.handle_error(error)
File "./pybtex\pybtex\database\input\bibtex.py", line 383, in handle_error
report_error(error)
File "./pybtex\pybtex\errors.py", line 78, in report_error
raise exception
File "./pybtex\pybtex\database\input\bibtex.py", line 189, in parse_bibliography
yield tuple(self.parse_command())
File "./pybtex\pybtex\database\input\bibtex.py", line 222, in parse_command
self.handle_error(error)
File "./pybtex\pybtex\database\input\bibtex.py", line 383, in handle_error
report_error(error)
File "./pybtex\pybtex\errors.py", line 78, in report_error
raise exception
File "./pybtex\pybtex\database\input\bibtex.py", line 220, in parse_command
self.required([body_end])
File "./pybtex\pybtex\scanner.py", line 120, in required
raise TokenRequired(description, self)
pybtex.scanner.TokenRequired: syntax error in line 2264: '}' expected
Bib file content at line 2264:
2264 note = "Special issue on Assistive Computer Vision and Robotics - "Assistive Solutions for Mobility, Communication and HMI" ",
Fix it by removing the inner quotes:
2264 note = "Special issue on Assistive Computer Vision and Robotics - Assistive Solutions for Mobility, Communication and HMI",
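To find such lines before running the merge, a small check like the one below can help. It is a heuristic sketch and not part of the tools: it only looks at fields written on a single line as name = "value", and it will also flag legitimately escaped quotes, so treat the result as a list of lines to review by hand.

```python
import re

# A field written on one line as: name = "value",
FIELD = re.compile(r'^\s*\w+\s*=\s*"(.*)"\s*,?\s*$')


def find_nested_quotes(lines):
    """Return (line_number, text) for quoted fields that contain inner quotes."""
    suspects = []
    for number, line in enumerate(lines, start=1):
        match = FIELD.match(line)
        # group(1) is the text between the first and the last double quote.
        if match and '"' in match.group(1):
            suspects.append((number, line.rstrip("\n")))
    return suspects


# Example use:
#
#     with open("science.bib", encoding="utf-8") as f:
#         for number, text in find_nested_quotes(f):
#             print(number, text)
```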