Sunday, October 21, 2018

Relation Triples Extraction with Stanford OpenIE


Setting up Stanford-OpenIE for Relation Triples:
To extract relation triples from a sentence, we can use an unofficial cross-platform Python wrapper for the state-of-the-art OpenIE information extraction library from Stanford University.

More details on: https://github.com/philipperemy/Stanford-OpenIE-Python.git

First, we can clone the project.

git clone https://github.com/philipperemy/Stanford-OpenIE-Python.git

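Then, I had to remove the "-" characters from the folder name, because a name used in a Python import statement cannot contain hyphens and the folder could not otherwise be imported as a module.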

mv Stanford-OpenIE-Python/ StanfordOpenIEPython/

cd StanfordOpenIEPython/



To use the wrapper as a module, we create an empty __init__.py file inside the folder.

touch __init__.py 
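
The rename is needed because a name used in an import statement must be a valid Python identifier, and a hyphen is not allowed in one. The short sketch below (Python 3, since str.isidentifier() does not exist in Python 2) checks both folder names and then shows one possible alternative to renaming: putting a folder on sys.path and importing a module from it by name. The helper import_from_folder, the demo folder and the demo_main module are hypothetical and only stand in for the cloned repository; this approach is an assumption and has not been tried against the wrapper itself.

```python
import importlib
import os
import sys
import tempfile

# A hyphen makes the folder name an invalid identifier, so "import" rejects it.
print('Stanford-OpenIE-Python'.isidentifier())  # False
print('StanfordOpenIEPython'.isidentifier())    # True


def import_from_folder(folder, module_name):
    """Hypothetical helper: import module_name from a folder of any name."""
    folder = os.path.abspath(folder)
    if folder not in sys.path:
        sys.path.insert(0, folder)
    importlib.invalidate_caches()  # make sure freshly created files are seen
    return importlib.import_module(module_name)


# Throwaway hyphenated folder standing in for the cloned repository.
demo_dir = os.path.join(tempfile.mkdtemp(), 'Demo-Hyphenated-Folder')
os.makedirs(demo_dir)
with open(os.path.join(demo_dir, 'demo_main.py'), 'w') as handle:
    handle.write('def hello():\n    return "imported"\n')

demo = import_from_folder(demo_dir, 'demo_main')
print(demo.hello())  # imported
```

Renaming the folder, as done above, remains the simpler route when the wrapper is imported as a package.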

Example code:

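from StanfordOpenIEPython.main import stanford_ie
import argparse

# Run this script from the directory that contains StanfordOpenIEPython/.
# The wrapper reads its input from a text file inside its own folder.
FILE_PATH = 'StanfordOpenIEPython/sentences.txt'


def triples_extractor(sentence):
    # Mode 'w' truncates any previous content, so the file holds only this sentence.
    with open(FILE_PATH, 'w') as text:
        text.write(sentence)
    # The file name is given relative to the wrapper folder, hence no folder prefix.
    triples_raw = stanford_ie('sentences.txt', verbose=True)
    # Strip the leading whitespace from every element of every triple.
    triples = [[trip.lstrip() for trip in triple] for triple in triples_raw]
    return triples


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("-s", "--sentence", default='Bill Gates is married to Melinda Gates.')
    args = parser.parse_args()
    # print() with a single argument works on both Python 2 and Python 3.
    print(triples_extractor(args.sentence))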



Output:
[['Bill Gates', 'is', 'married'], ['Bill Gates', 'is married to', 'Melinda Gates']]
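
The output above contains two overlapping triples for the same subject, the second being the more informative one. As a minimal sketch of how such a list could be post-processed, the snippet below wraps each triple in a namedtuple and keeps, per subject, the triple with the longest relation phrase. The names Triple, to_named and most_specific are hypothetical, and the longest-relation rule is only an illustrative heuristic, not something the wrapper or Stanford OpenIE provides.

```python
from collections import namedtuple

# Hypothetical container giving names to the three positions of a triple.
Triple = namedtuple('Triple', ['subject', 'relation', 'object'])


def to_named(triples):
    """Convert [subject, relation, object] lists into Triple tuples."""
    return [Triple(*t) for t in triples if len(t) == 3]


def most_specific(triples):
    """Keep, for each subject, the triple with the longest relation phrase."""
    best = {}
    for t in triples:
        current = best.get(t.subject)
        if current is None or len(t.relation) > len(current.relation):
            best[t.subject] = t
    return list(best.values())


# Input copied from the output shown above.
raw = [['Bill Gates', 'is', 'married'],
       ['Bill Gates', 'is married to', 'Melinda Gates']]
named = to_named(raw)
print(most_specific(named))
# [Triple(subject='Bill Gates', relation='is married to', object='Melinda Gates')]
```

Named fields make later code easier to read (t.subject instead of t[0]), and the filter can be swapped for any other rule that suits the data.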
 

More on: https://github.com/apogre/python_ner/blob/master/triples.py
