src.preprocessing.extract_dates_regex
Extract the creation dates from the full text of documents.
Classes
| ExtractDatesRegex | Extract the creation date of a body of text with a regex. |
- class src.preprocessing.extract_dates_regex.ExtractDatesRegex(min_date, max_date)
Extract the creation date of a body of text with a regex.
The regex tries to match every date written as day-month-year or as day {monthname} year. The month name can be in Dutch or English.
min_date: The minimum date to consider, in the format %d-%m-%Y.
max_date: The maximum date to consider, in the format %d-%m-%Y.
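The matching described above can be illustrated with a minimal sketch. This is not the class's actual implementation: the function name `find_dates`, the `MONTHS` lookup table, and both patterns are assumptions made for illustration, and the real regex may accept other separators or abbreviations.

```python
import re
from datetime import datetime

# Hypothetical lookup table: Dutch and English month names to month numbers.
MONTHS = {
    "januari": 1, "january": 1,
    "februari": 2, "february": 2,
    "maart": 3, "march": 3,
    "april": 4,
    "mei": 5, "may": 5,
    "juni": 6, "june": 6,
    "juli": 7, "july": 7,
    "augustus": 8, "august": 8,
    "september": 9,
    "oktober": 10, "october": 10,
    "november": 11,
    "december": 12,
}

# Longest names first so the alternation prefers "augustus" over "august".
MONTH_ALTERNATION = "|".join(sorted(MONTHS, key=len, reverse=True))

# Assumed patterns: day-month-year, and day {monthname} year.
NUMERIC = re.compile(r"\b(\d{1,2})-(\d{1,2})-(\d{4})\b")
NAMED = re.compile(
    rf"\b(\d{{1,2}})\s+({MONTH_ALTERNATION})\s+(\d{{4}})\b", re.IGNORECASE
)


def find_dates(text, min_date, max_date):
    """Return all dates in text that fall within [min_date, max_date]."""
    lower = datetime.strptime(min_date, "%d-%m-%Y")
    upper = datetime.strptime(max_date, "%d-%m-%Y")

    # Collect (day, month, year) candidates from both patterns.
    candidates = []
    for day, month, year in NUMERIC.findall(text):
        candidates.append((int(day), int(month), int(year)))
    for day, name, year in NAMED.findall(text):
        candidates.append((int(day), MONTHS[name.lower()], int(year)))

    found = []
    for day, month, year in candidates:
        try:
            parsed = datetime(year, month, day)
        except ValueError:
            continue  # skip impossible dates such as 31-02-2021
        if lower <= parsed <= upper:
            found.append(parsed)
    return found
```

In this sketch, numeric matches are returned before month-name matches, and anything outside the min_date to max_date window is dropped, which mirrors the role the two constructor arguments play in the description above.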
- min_date: InitVar
- max_date: InitVar
- min_date: