Jaime Collado / textflow
Commit f0547ed3, authored Jul 18, 2022 by Estrella Vallecillo
Fixing ComplexityAnalyzer and NGramsAnalyzer
parent 02dc279b
Showing 2 changed files with 23 additions and 14 deletions
textflow/ComplexityAnalyzer.py
textflow/NGramsAnalyzer.py
textflow/ComplexityAnalyzer.py
@@ -207,9 +207,14 @@ class ComplexityAnalyzer(Analyzer):
         avgLettersWords = numLetters / self.numWords
         listLenLetters = np.array(listLenLetters)
-        self.poliniComprensibility = 95.2 - (9.7 * avgLettersWords) - ((0.35 * self.numWords) / self.numSentences)
-        self.muLegibility = (self.numWords / (self.numWords - 1)) * (avgLettersWords / listLenLetters.var()) * 100
+        if self.numSentences == 0:
+            self.poliniComprensibility = 95.2 - (9.7 * avgLettersWords) - ((0.35 * self.numWords) / 1)
+        else:
+            self.poliniComprensibility = 95.2 - (9.7 * avgLettersWords) - ((0.35 * self.numWords) / self.numSentences)
+        if self.numWords < 2:
+            self.muLegibility = 0
+        else:
+            self.muLegibility = (self.numWords / (self.numWords - 1)) * (avgLettersWords / listLenLetters.var()) * 100
     def lexicalIndex(self):
         """
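The guards added in this hunk can be sketched as a standalone function. This is a minimal illustration under stated assumptions, not the project's code: `readability_indices`, its parameters, and the extra checks for an empty word list and for zero variance are hypothetical and go beyond what the commit changes. It uses `statistics.pvariance`, which, like NumPy's default `var()`, computes the population variance.

```python
from statistics import pvariance


def readability_indices(word_lengths, num_sentences):
    """Return (polini, mu) for a list of per-word letter counts.

    Hypothetical helper: mirrors the formulas in the diff, with two
    extra guards (empty input, zero variance) that the commit does not add.
    """
    num_words = len(word_lengths)
    if num_words == 0:
        # Assumption: no words means both indices default to 0.
        return 0.0, 0.0

    avg_letters = sum(word_lengths) / num_words

    # Same fallback as the commit: treat zero sentences as one sentence.
    sentences = num_sentences if num_sentences > 0 else 1
    polini = 95.2 - (9.7 * avg_letters) - ((0.35 * num_words) / sentences)

    variance = pvariance(word_lengths)
    if num_words < 2 or variance == 0:
        # num_words < 2 is the commit's guard; variance == 0 is an added assumption.
        mu = 0.0
    else:
        mu = (num_words / (num_words - 1)) * (avg_letters / variance) * 100

    return polini, mu
```

Note that the context line `avgLettersWords = numLetters / self.numWords` sits above the new guards, so a text with zero words would presumably still raise before they are reached; the sketch returns early in that case.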
textflow/NGramsAnalyzer.py
@@ -70,15 +70,19 @@ class NGramsAnalyzer(Analyzer):
         Args:
             text: a string/text to analyze
         """
-        vect = sklearn.feature_extraction.text.CountVectorizer(ngram_range=(self.ngramsSize, self.ngramsSize), tokenizer=self.tokenizer.tokenize, stop_words=self.stopwords)
-        text = [text]
-        vect.fit(text)
-        self.listOfNGrams = vect.get_feature_names_out().tolist()
-        dicfreq = {}
-        for i in self.listOfNGrams:
-            if i in dicfreq:
-                dicfreq[i] += 1
-            else:
-                dicfreq[i] = 1
-        self.freqNGrams = dicfreq
+        try:
+            vect = sklearn.feature_extraction.text.CountVectorizer(ngram_range=(self.ngramsSize, self.ngramsSize), tokenizer=self.tokenizer.tokenize, stop_words=self.stopwords)
+            text = [text]
+            vect.fit(text)
+            self.listOfNGrams = vect.get_feature_names_out().tolist()
+            dicfreq = {}
+            for i in self.listOfNGrams:
+                if i in dicfreq:
+                    dicfreq[i] += 1
+                else:
+                    dicfreq[i] = 1
+            self.freqNGrams = dicfreq
+        except Exception:
+            self.listOfNGrams = []
+            self.freqNGrams = {}
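One thing to note about the hunk above: `get_feature_names_out()` returns each vocabulary entry once, so the counting loop assigns 1 to every n-gram regardless of how often it occurs in the text. If true occurrence counts are intended, they could be computed directly. The following is a sketch using only the standard library rather than `CountVectorizer`; `ngram_frequencies`, the whitespace tokenizer, and the lowercasing are assumptions for illustration and do not reproduce the project's tokenizer.

```python
from collections import Counter


def ngram_frequencies(text, n, stopwords=None):
    """Return (sorted n-gram list, {n-gram: count}) for one text.

    Hypothetical helper: uses a plain whitespace tokenizer instead of
    the project's self.tokenizer, purely for illustration.
    """
    stop = set(stopwords or [])
    tokens = [tok for tok in text.lower().split() if tok not in stop]

    # Mirror the commit's fallback: return empty results instead of raising
    # when no n-gram can be built.
    if n < 1 or len(tokens) < n:
        return [], {}

    # Slide a window of size n over the tokens and join each window.
    grams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    freq = dict(Counter(grams))
    return sorted(freq), freq
```

For example, `ngram_frequencies("a b a b", 2)` counts the bigram `"a b"` twice and `"b a"` once. Checking for the empty case explicitly, as here, is one alternative to the broad `except Exception` in the commit, which would also hide unrelated errors.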