src.preprocessing.cluster_documents

Compute the membership vectors for each cluster.

Classes

ClusterDocuments([periods])

Cluster the documents based on the event and the date similarity.

class src.preprocessing.cluster_documents.ClusterDocuments(periods=4)

Cluster the documents based on the event and the date similarity.

Parameters:
  • periods (int) – The number of time periods to consider. Defaults to 4.

periods: int = 4

custom_transform(data, **transform_args)

Cluster the documents based on the event and the date similarity.

Parameters:
  • data (DataFrame) – The data to transform.

  • transform_args (Never) – [UNUSED] Additional keyword arguments.

Return type:

DataFrame

Returns:

The transformed data.
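
Example:

The page above does not show the implementation, so the following is only a minimal sketch of how a transformer with this interface might be used. ClusterDocumentsSketch is a hypothetical stand-in, not the real ClusterDocuments class. The column names "event", "date", "period" and "cluster" are assumptions, as is the choice to split the date range into equal-width periods and to treat each (event, period) pair as one cluster. It assigns plain cluster labels and does not attempt to compute membership vectors.

```python
from dataclasses import dataclass

import pandas as pd


@dataclass
class ClusterDocumentsSketch:
    """Hypothetical stand-in that mirrors the documented interface only."""

    periods: int = 4  # same default as the documented class

    def custom_transform(self, data: pd.DataFrame, **transform_args) -> pd.DataFrame:
        # transform_args is accepted but ignored, as in the documented signature.
        out = data.copy()

        # Assumed input columns: "event" (label) and "date" (parseable date).
        dates = pd.to_datetime(out["date"])
        span = (dates.max() - dates.min()).total_seconds()

        if span == 0:
            # All documents share one date, so there is a single period.
            period = pd.Series(0, index=out.index)
        else:
            # Map each date onto one of `periods` equal-width buckets.
            offset = (dates - dates.min()).dt.total_seconds()
            period = (offset / span * self.periods).astype(int)
            # The latest date would land in bucket `periods`; fold it back.
            period = period.clip(upper=self.periods - 1)

        out["period"] = period
        # Assumption: documents with the same event and period form a cluster.
        out["cluster"] = out.groupby(["event", "period"]).ngroup()
        return out


docs = pd.DataFrame(
    {
        "event": ["flood", "flood", "flood", "fire"],
        "date": ["2020-01-01", "2020-01-02", "2020-12-31", "2020-01-01"],
    }
)
clustered = ClusterDocumentsSketch(periods=4).custom_transform(docs)
print(clustered[["event", "period", "cluster"]])
```

In this sketch the first two "flood" rows fall in the same period and share a cluster, while the late "flood" row and the "fire" row each get their own. The input frame is copied, so the caller's data is left unchanged.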