Commit c0a6fc65 by Jaime Collado

INSTALL and README updated

parent 6e5a4700
 # Installation instructions
+In order to make use of this library, install it as follows:
 1. Clone this repository: `git clone https://gitlab.ujaen.es/ammr0032/TextAnalysisSpacy.git`
 2. Use this inside the project's folder: `python -m pip install .`
+This library has been tested in Python 3.8+
\ No newline at end of file
@@ -7,13 +7,13 @@ This class provides methods for the calculation of different metrics on text. It
 - [INSTALL.md](INSTALL.md): A guide to make this project work on your local environment.
 ## ./src/texty
-- [analyzer.py](src/texty/analyzer.py): This class provides methods for the calculation of different metrics on text.
+- [analyzer.py](src/texty/analyzer.py): This module provides a class with methods for the calculation of different metrics on text.
-- [complexity.py](src/texty/complexity.py): This class provides methods for the calculation of different complexity metrics on text.
+- [complexity.py](src/texty/complexity.py): This module provides a class methods for the calculation of different complexity metrics on text.
 - [CREA_total.txt](CREA_total.txt): A dataset of 737799 spanish words ordered by its absolute frequency.
-- [analyze_complexity.py](src/texty/analyze_complexity.py): A script that takes a .txt file and an output format as input and generates a file containing all metrics as calculated by the ComplexityAnalyzer class.
+- [analyze_complexity.py](src/texty/analyze_complexity.py): Script that takes a .txt file and an output format as input and generates a file containing all metrics as calculated by the ComplexityAnalyzer class.
 ## ./examples
-- [example_text.txt](examples/example_text.txt): A simple .txt file to test the library.
+- [example_text.txt](examples/example_text.txt): Simple .txt file to test the library.
 - [example.ipynb](examples/example.ipynb): Colab notebook that shows how to use the ComplexityAnalyzer class.
@@ -34,3 +34,11 @@ In this section, we introduce the different metrics offered in this Python libra
 * **Feature selection**: Remove features with low variance and SelectFromModel (Selection of functions based on L1)
 * **kBest**: Selection of the k best features
+# Usage
+You can run _Texty_ from terminal as follows:
+`analyze-complexity {text_file.txt} [-o output_format (csv, tsv or json)]`
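As a reading aid for the usage line above, here is a minimal sketch of an argument parser that would accept that command line. It is not taken from the repository: `build_parser`, the `output_format` attribute name, and the `csv` default are all assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented command line (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="analyze-complexity")
    # Positional argument: path to the .txt file to analyze.
    parser.add_argument("text_file", help="path to a .txt file")
    # Optional output format. The "csv" default is an assumption,
    # the README does not say which format is used when -o is omitted.
    parser.add_argument(
        "-o",
        dest="output_format",
        choices=("csv", "tsv", "json"),
        default="csv",
        help="output format",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["example_text.txt", "-o", "json"])
    print(args.text_file, args.output_format)
```

Restricting `-o` with `choices` makes an unsupported format fail at parse time instead of later, when the output file is written.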
@@ -25,9 +25,6 @@ def analyze_complexity(args=None):
         exit()
     # Instantiate the ComplexityAnalyzer class
-    try:
-        nlp = spacy.load("es_core_news_sm")
-    except:
-        spacy.cli.download("es_core_news_sm")
-        nlp = spacy.load("es_core_news_sm")
+    spacy.cli.download("es_core_news_sm")
+    nlp = spacy.load("es_core_news_sm")
     complexity_analyzer = ComplexityAnalyzer("es", nlp)
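Review note: the hunk above appears to drop the `try`/`except`, so the model download would run on every invocation. If the intent was only to get rid of the bare `except:`, one possible alternative is sketched below. `load_or_download` is a hypothetical helper, not part of the repository, and it assumes that a missing model surfaces as `OSError`, which is what `spacy.load` raises when it cannot find a model.

```python
from typing import Any, Callable


def load_or_download(
    name: str,
    load: Callable[[str], Any],
    download: Callable[[str], Any],
) -> Any:
    """Load a model, downloading it only when the first load fails."""
    try:
        return load(name)
    except OSError:
        # Assumption: a missing model is reported as OSError.
        download(name)
        return load(name)
```

With spaCy this would be called as `nlp = load_or_download("es_core_news_sm", spacy.load, spacy.cli.download)`. Passing the two callables in keeps the helper testable without network access.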