BERN2 Documentation
In biomedical natural language processing, named entity recognition (NER) and named entity normalization (NEN) are key tasks that enable the automatic extraction of biomedical entities (e.g., diseases and chemicals) from the ever-growing biomedical literature.
We present BERN2 (Advanced Biomedical Entity Recognition and Normalization), a tool that improves on the previous neural network-based NER tool (Kim et al., 2019) by employing a multi-task NER model and neural network-based NEN models to achieve much faster and more accurate inference. See our paper for more details.
Plain Text as Input
HTTP method
POST
Request URL
http://bern2.korea.ac.kr/plain
Request Body
{ "text":"Autophagy maintains tumour growth through circulating arginine." }
Response
If the input text is annotated successfully, the server responds with a 200 OK status code. The response body contains a JSON representation of the annotations.
{
  "annotations": [
    {
      "id": [ "MESH:D009369" ],
      "is_neural_normalized": false,
      "prob": 0.9999922513961792,
      "mention": "tumour",
      "obj": "disease",
      "span": { "begin": 20, "end": 26 }
    },
    {
      "id": [ "MESH:D001120" ],
      "is_neural_normalized": false,
      "prob": 0.9819278717041016,
      "mention": "arginine",
      "obj": "drug",
      "span": { "begin": 54, "end": 62 }
    }
  ],
  "text": "Autophagy maintains tumour growth through circulating arginine.",
  "timestamp": "Thu Dec 23 04:12:28 +0000 2021"
}
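As a sketch of how a client might consume this response, the snippet below walks the annotations list, checks each span against the text, and keeps only annotations at or above a probability cutoff. The helper name `extract_entities` and the `min_prob` parameter are illustrative assumptions, not part of the BERN2 API. The sample data is the response shown above, where `text[begin:end]` reproduces each mention.

```python
# Sample response copied from the documentation above.
response = {
    "annotations": [
        {
            "id": ["MESH:D009369"],
            "is_neural_normalized": False,
            "prob": 0.9999922513961792,
            "mention": "tumour",
            "obj": "disease",
            "span": {"begin": 20, "end": 26},
        },
        {
            "id": ["MESH:D001120"],
            "is_neural_normalized": False,
            "prob": 0.9819278717041016,
            "mention": "arginine",
            "obj": "drug",
            "span": {"begin": 54, "end": 62},
        },
    ],
    "text": "Autophagy maintains tumour growth through circulating arginine.",
    "timestamp": "Thu Dec 23 04:12:28 +0000 2021",
}


def extract_entities(response, min_prob=0.0):
    """Return (mention, type, ids) tuples for annotations with prob >= min_prob.

    `extract_entities` and `min_prob` are hypothetical names for this sketch.
    """
    text = response["text"]
    entities = []
    for ann in response["annotations"]:
        begin, end = ann["span"]["begin"], ann["span"]["end"]
        # In the sample, the span is a half-open character range into `text`.
        if text[begin:end] != ann["mention"]:
            raise ValueError(f"span {begin}-{end} does not match {ann['mention']!r}")
        if ann["prob"] >= min_prob:
            entities.append((ann["mention"], ann["obj"], ann["id"]))
    return entities


# Print one tab-separated line per entity: mention, type, normalized IDs.
for mention, obj, ids in extract_entities(response):
    print(f"{mention}\t{obj}\t{','.join(ids)}")
```

Raising on a span mismatch is a design choice for this sketch; a client that prefers to tolerate mismatches could log and skip the annotation instead.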
Curl command
$ curl -d '{"text":"Autophagy maintains tumour growth through circulating arginine."}' \
    -H "Content-Type: application/json" \
    -X POST http://bern2.korea.ac.kr/plain
Python example
import requests

def query_plain(text, url="http://bern2.korea.ac.kr/plain"):
    return requests.post(url, json={'text': text}).json()

if __name__ == '__main__':
    text = "Autophagy maintains tumour growth through circulating arginine."
    print(query_plain(text))
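The example above sets no timeout and does not check the status code. The following is a minimal sketch of one way to add both, using only the Python standard library. The function names `build_plain_request` and `send`, and the 30-second timeout, are assumptions made for illustration. Building the request is kept separate from sending it so that the request can be inspected without contacting the server.

```python
import json
import urllib.request


def build_plain_request(text, url="http://bern2.korea.ac.kr/plain"):
    """Build (but do not send) the POST request for the /plain endpoint."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the JSON response body.

    urlopen raises urllib.error.HTTPError for error statuses such as
    4xx/5xx, so a normal return means the request succeeded. The timeout
    value is an arbitrary choice for this sketch.
    """
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Inspect the request without sending it.
req = build_plain_request(
    "Autophagy maintains tumour growth through circulating arginine."
)
print(req.get_method(), req.full_url)
```

Calling `send(req)` performs the actual network request and returns the parsed annotations.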
PubMed ID (PMID) as Input
HTTP method
GET
Request URL
http://bern2.korea.ac.kr/pubmed/30429607,29446767
Request Body
None
Response
If the PubMed articles are annotated successfully, the server responds with a 200 OK status code. The response body contains a JSON representation of the annotations.
[
  {
    "pmid": "30429607",
    "annotations": [
      {
        "id": [ "MESH:D009369" ],
        "is_neural_normalized": false,
        "prob": 0.9999922513961792,
        "mention": "tumour",
        "obj": "disease",
        "span": { "begin": 20, "end": 26 }
      },
      ...
    ],
    "text": "Autophagy maintains tumour growth through circulating arginine. ...",
    "timestamp": "Thu Dec 23 05:17:50 +0000 2021"
  },
  {
    "pmid": "29446767",
    "annotations": [
      {
        "id": [ "MESH:C567763" ],
        "is_neural_normalized": false,
        "prob": 0.9999992847442627,
        "mention": "CLAPO syndrome",
        "obj": "disease",
        "span": { "begin": 0, "end": 14 }
      },
      ...
    ],
    "text": "CLAPO syndrome: identification of somatic activating PIK3CA mutations and PURPOSE: CLAPO syndrome is a rare vascular disorder characterized by capillary malformation of the lower lip, lymphatic malformation predominant on the face and neck, asymmetry, and partial/generalized overgrowth. ...",
    "timestamp": "Thu Dec 23 05:17:51 +0000 2021"
  }
]
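Because the PMID endpoint returns a list with one object per article, a client typically needs to group results by `pmid`. The sketch below counts entity types per article. The function name `entity_types_by_pmid` is hypothetical, and the sample contains only the two annotations visible above; the elided ("...") entries are not reproduced.

```python
from collections import Counter

# Trimmed sample: only the annotations visible in the documentation above.
response = [
    {
        "pmid": "30429607",
        "annotations": [
            {
                "id": ["MESH:D009369"],
                "mention": "tumour",
                "obj": "disease",
                "span": {"begin": 20, "end": 26},
            },
        ],
    },
    {
        "pmid": "29446767",
        "annotations": [
            {
                "id": ["MESH:C567763"],
                "mention": "CLAPO syndrome",
                "obj": "disease",
                "span": {"begin": 0, "end": 14},
            },
        ],
    },
]


def entity_types_by_pmid(response):
    """Map each PMID to a Counter of entity types found in that article."""
    return {
        article["pmid"]: Counter(ann["obj"] for ann in article["annotations"])
        for article in response
    }


# Print the per-article entity type counts.
for pmid, counts in entity_types_by_pmid(response).items():
    print(pmid, dict(counts))
```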
Curl command
$ curl -H "Content-Type: application/json" \
    -X GET http://bern2.korea.ac.kr/pubmed/30429607,29446767
Python example
import requests

def query_pmid(pmids, url="http://bern2.korea.ac.kr/pubmed"):
    return requests.get(url + "/" + ",".join(pmids)).json()

if __name__ == '__main__':
    pmids = ["30429607", "29446767"]
    print(query_pmid(pmids))
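Since the PMIDs are placed directly in the URL path, it can be worth validating them before sending the request. The following sketch builds the comma-separated URL used above and rejects anything that is not a string of ASCII digits. The helper name `build_pubmed_url` and the digits-only rule are assumptions for illustration, not a documented BERN2 requirement.

```python
def build_pubmed_url(pmids, base="http://bern2.korea.ac.kr/pubmed"):
    """Return the request URL for a list of PMIDs (strings or ints).

    Hypothetical helper: raises ValueError for an empty list or for any
    entry that is not made up of ASCII digits.
    """
    cleaned = [str(p).strip() for p in pmids]
    if not cleaned:
        raise ValueError("at least one PMID is required")
    bad = [p for p in cleaned if not (p.isascii() and p.isdigit())]
    if bad:
        raise ValueError(f"not valid PMIDs: {bad}")
    return base + "/" + ",".join(cleaned)


print(build_pubmed_url(["30429607", "29446767"]))
```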
Installing BERN2
You first need to install BERN2 and its dependencies.
# Install torch with conda (please check your CUDA version)
conda create -n bern2 python=3.7
conda activate bern2
conda install pytorch==1.9.0 cudatoolkit=10.2 -c pytorch
conda install faiss-gpu libfaiss-avx2 -c conda-forge

# Check if cuda is available
python -c "import torch;print(torch.cuda.is_available())"

# Install BERN2
git clone git@github.com:mjeensung/bern2.git
cd bern2
pip install -r requirements.txt
(Optional) If you want to use MongoDB as a caching database, you need to install and run it.
# https://docs.mongodb.com/manual/tutorial/install-mongodb-on-ubuntu/#install-mongodb-community-edition-using-deb-packages
sudo systemctl start mongod
sudo systemctl status mongod
Then, you need to download resources (e.g., external modules or dictionaries) for running BERN2.
Note that you will need 70 GB of free disk space.
wget http://nlp.dmis.korea.edu/projects/bern2-sung-et-al-2022/resources_v1.1.b.tar.gz
tar -zxvf resources_v1.1.b.tar.gz
rm -rf resources_v1.1.b.tar.gz

# (For Linux/macOS users) install CRF
cd resources/GNormPlusJava
tar -zxvf CRF++-0.58.tar.gz
mv CRF++-0.58 CRF
cd CRF
./configure --prefix="$HOME"
make
make install
cd ../../..

# (For Windows users) install CRF
cd resources/GNormPlusJava
unzip CRF++-0.58.zip
mv CRF++-0.58 CRF
cd ../..
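Before starting the download above, it may help to confirm that the target drive has the 70 GB mentioned earlier. This is a small sketch using `shutil.disk_usage` from the standard library. The helper names are hypothetical, and treating "GB" as 10^9 bytes is an assumption, since the documentation does not say whether decimal or binary units are meant.

```python
import shutil

REQUIRED_GB = 70  # figure quoted in the documentation above


def free_gb(path="."):
    """Free space on the filesystem containing `path`, in decimal GB (assumed unit)."""
    return shutil.disk_usage(path).free / 1e9


def has_enough_space(path=".", required_gb=REQUIRED_GB):
    """True if the filesystem containing `path` has at least `required_gb` free."""
    return free_gb(path) >= required_gb


# Report free space for the current directory.
print(f"{free_gb():.1f} GB free; enough for the BERN2 resources: {has_enough_space()}")
```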
Running BERN2
The following command runs BERN2.
export CUDA_VISIBLE_DEVICES=0
cd scripts
bash run_bern2.sh
(Optional) To restart BERN2, run the following commands.
export CUDA_VISIBLE_DEVICES=0
cd scripts
bash stop_bern2.sh
bash start_bern2.sh
Using BERN2
After successfully running BERN2 in your local environment, you can access it via its RESTful API.
Apart from the base URL (use http://localhost:8888 instead of http://bern2.korea.ac.kr), the local installation is used in exactly the same way as the web service.
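Since only the base URL differs between the two deployments, client code can keep it in one place. The sketch below shows one possible way to do that. The helper `bern2_url` and its `local` flag are hypothetical names; the two base URLs and the `plain` and `pubmed` endpoints come from this documentation.

```python
WEB_BASE = "http://bern2.korea.ac.kr"
LOCAL_BASE = "http://localhost:8888"


def bern2_url(endpoint, local=False):
    """Return the full URL for an endpoint such as "plain" or "pubmed".

    Hypothetical helper: `local=True` targets a local installation,
    otherwise the public web service is used.
    """
    base = LOCAL_BASE if local else WEB_BASE
    return base + "/" + endpoint.lstrip("/")


print(bern2_url("plain", local=True))
print(bern2_url("pubmed"))
```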
Plain Text as Input
import requests

def query_plain(text, url="http://localhost:8888/plain"):
    return requests.post(url, json={'text': text}).json()

if __name__ == '__main__':
    text = "Autophagy maintains tumour growth through circulating arginine."
    print(query_plain(text))
PubMed ID (PMID) as Input
import requests

def query_pmid(pmids, url="http://localhost:8888/pubmed"):
    return requests.get(url + "/" + ",".join(pmids)).json()

if __name__ == '__main__':
    pmids = ["30429607", "29446767"]
    print(query_pmid(pmids))
If you have any questions or find a bug, please contact mujeensung@korea.ac.kr or minbyuljeong@korea.ac.kr.