negspacy package
Submodules
negspacy.negation module
class negspacy.negation.Negex(nlp, language='en', ent_types=[], psuedo_negations=[], preceding_negations=[], following_negations=[], termination=[], chunk_prefix=[])
Bases: object
A spaCy pipeline component which identifies negated tokens in text.
Based on: NegEx - A Simple Algorithm for Identifying Negated Findings and Diseases in Discharge Summaries
Chapman, Bridewell, Hanbury, Cooper, Buchanan
- Parameters
nlp (object) – spaCy language object
ent_types (list) – list of entity types to negate
language (str) – language code, used when relying on the default termsets (e.g. “en” for English)
psuedo_negations (list) – list of phrases that cancel out a negation; if empty, defaults are used
preceding_negations (list) – negations that appear before an entity; if empty, defaults are used
following_negations (list) – negations that appear after an entity; if empty, defaults are used
termination (list) – phrases that “terminate” a sentence for processing purposes, such as “but”; if empty, defaults are used
get_patterns()
Returns the phrase patterns used for the various negation dictionaries.
- Returns
patterns – dictionary mapping each pattern type to its list of patterns (pattern_type: [patterns])
- Return type
dict
negex(doc)
Negates entities of interest.
- Parameters
doc (object) – spaCy Doc object
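The idea behind negating entities can be sketched in plain Python, without spaCy. This is only an illustration under stated assumptions, not the library's implementation: entities are assumed to be (start, end) token spans, negation matches are assumed to be (label, start, end) tuples, and the function name `negated_entities` is hypothetical. An entity is treated as negated when a preceding negation ends before it, or a following negation starts after it, within the same sub-sentence.

```python
def negated_entities(entities, preceding, following, boundaries):
    """Return one boolean per entity: True if the entity looks negated.

    entities   -- list of (start, end) token spans (assumed shape)
    preceding  -- list of (label, start, end) negations expected before an entity
    following  -- list of (label, start, end) negations expected after an entity
    boundaries -- list of (start, end) sub-sentence spans
    """
    flags = []
    for ent_start, ent_end in entities:
        for b_start, b_end in boundaries:
            # Only consider the sub-sentence that fully contains the entity.
            if not (b_start <= ent_start and ent_end <= b_end):
                continue
            before = any(b_start <= s and e <= ent_start for _, s, e in preceding)
            after = any(ent_end <= s and e <= b_end for _, s, e in following)
            flags.append(before or after)
            break
        else:
            # Entity is not covered by any boundary: leave it un-negated.
            flags.append(False)
    return flags


# Tokens: ["She", "denies", "fever", "but", "reports", "cough"]
entities = [(2, 3), (5, 6)]            # "fever", "cough"
preceding = [("preceding", 1, 2)]      # "denies"
boundaries = [(0, 3), (3, 6)]          # split at "but"
flags = negated_entities(entities, preceding, [], boundaries)
```

Here "fever" shares a sub-sentence with "denies" and is flagged, while "cough" sits after the termination "but" and is not.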
process_negations(doc)
Finds negations in doc and cleans the candidate negations by removing pseudo-negations.
- Parameters
doc (object) – spaCy Doc object
- Returns
preceding (list) – list of tuples for preceding negations
following (list) – list of tuples for following negations
terminating (list) – list of tuples of terminating phrases
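As a rough illustration of the cleaning step, the sketch below works on a plain list of token strings instead of a spaCy Doc. The helper names (`find_phrases`, `process_negations_sketch`), the string labels and the containment rule for dropping candidates are all assumptions made for this example; the library's own matching and filtering may differ.

```python
def find_phrases(tokens, phrases, label):
    """Return (label, start, end) for every occurrence of a phrase in tokens."""
    lowered = [t.lower() for t in tokens]
    matches = []
    for phrase in phrases:
        words = phrase.lower().split()
        n = len(words)
        for i in range(len(lowered) - n + 1):
            if lowered[i:i + n] == words:
                matches.append((label, i, i + n))
    return sorted(matches, key=lambda m: m[1])


def process_negations_sketch(tokens, pseudo, preceding, following, termination):
    """Find candidate negations and drop those that fall inside a pseudo-negation."""
    pseudo_spans = find_phrases(tokens, pseudo, "pseudo")

    def inside_pseudo(match):
        _, start, end = match
        return any(p_start <= start and end <= p_end
                   for _, p_start, p_end in pseudo_spans)

    def clean(phrases, label):
        return [m for m in find_phrases(tokens, phrases, label)
                if not inside_pseudo(m)]

    return (clean(preceding, "preceding"),
            clean(following, "following"),
            clean(termination, "termination"))


tokens = "There is no increase in size but no fever".split()
prec, foll, term = process_negations_sketch(
    tokens,
    pseudo=["no increase"],
    preceding=["no"],
    following=["was ruled out"],
    termination=["but"],
)
```

The first "no" is discarded because it is part of the pseudo-negation "no increase"; the second "no" survives as a preceding negation, and "but" is reported as a terminating phrase.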
termination_boundaries(doc, terminating)
Creates sub-sentences based on the terminations found in the text.
- Parameters
doc (object) – spaCy Doc object
terminating (list) – list of tuples with (match_id, start, end)
- Returns
boundaries – list of tuples with (start, end) of spans
- Return type
list
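One way to picture the boundary computation is to cut the token range at every sentence start and at the start of every terminating phrase. The sketch below assumes the sentence starts are given as token indices rather than read from a spaCy Doc, and the function name `termination_boundaries_sketch` is hypothetical; it is not the library's code.

```python
def termination_boundaries_sketch(n_tokens, sent_starts, terminating):
    """Split the range [0, n_tokens) into (start, end) sub-sentence spans.

    sent_starts -- token indices where sentences begin (assumed input)
    terminating -- list of (match_id, start, end) tuples for terminating phrases
    """
    cut_points = set(sent_starts) | {start for _, start, _ in terminating}
    cut_points |= {0, n_tokens}
    cuts = sorted(cut_points)
    # Pair each cut point with the next one to form half-open spans.
    return list(zip(cuts, cuts[1:]))


# "There is no increase in size but no fever": 9 tokens, "but" at index 6
boundaries = termination_boundaries_sketch(9, [0], [("termination", 6, 7)])
```

With this convention the terminating phrase itself opens the second span, so a negation before "but" cannot reach an entity that comes after it.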