Commit 7c49a475 authored by DETCHART Jonathan

set optional part

parent d3c20bc8
@@ -14,7 +14,7 @@ From a programming point of view, the first objectives of this project are to le
- [_Q3. Influence of the text length on the quality of detection_](#q3-influence-of-the-text-length-on-the-quality-of-detection-toc)
- [_Q3.1._](#q31-toc)
- [_Q3.2._](#q32-toc)
-- [_Q4. Compression with Huffman_](#q4-compression-with-huffman-toc)
+- [_Q4. Compression with Huffman_](#q4-compression-with-huffman-toc-optional)
- [_Upload your work to the LMS_](#upload-your-work-to-the-lms-toc)
![Languages](./assets/languages.jpg)
@@ -119,7 +119,7 @@ For each text length, evaluate the probability of good detection and plot the re
##### Q3.2. [[toc](#table-of-content)]
To try to improve this result, you can try to capture the redundancy of a language in sequences of 2 (or more) letters. To do so, treat each pair of consecutive letters as a single symbol (of course, the size of your alphabet of symbols will increase) and repeat the same analysis to check whether the detection results improve.
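As a rough illustration of treating pairs of consecutive letters as symbols, here is a minimal sketch. The function name `ngram_frequencies` is hypothetical and not part of the project skeleton, and the choice to drop every non-letter character (so that pairs may span word boundaries) is an assumption, not a requirement of the assignment.

```python
from collections import Counter


def ngram_frequencies(text, n=2):
    """Return the relative frequency of each run of n consecutive letters.

    Hypothetical helper: non-letter characters are discarded first
    (an assumption), so n-grams may span word boundaries.
    """
    letters = [c for c in text.lower() if c.isalpha()]
    # Slide a window of width n over the letters.
    grams = ["".join(letters[i:i + n]) for i in range(len(letters) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.items()}
```

With `n=1` this reduces to single-letter frequencies, so the same function could serve both the original analysis and the pair-based one.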
-## Q4. Compression with Huffman [[toc](#table-of-content)]
+## Q4. Compression with Huffman [[toc](#table-of-content)] (_optional_)
:file_folder: Create a new file `src/data_compression.py`.
The analysis of the occurrence frequencies of the characters in the languages shows that these frequencies vary strongly from one character to another. This variability means that there is some "redundancy" in the languages. In other words, it is possible to compress text written in a language by exploiting these occurrence frequencies.
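To make the link between occurrence frequencies and compression concrete, here is a minimal sketch of building a Huffman code from the character counts of a text. The function name `huffman_code` and its structure are assumptions for illustration only; the assignment may expect a different interface in `src/data_compression.py`.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Return a dict mapping each character of text to a binary code string.

    Hypothetical sketch: frequent characters receive shorter codes.
    """
    counts = Counter(text)
    if not counts:
        return {}
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(counts)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: partial_code}).
    # The unique integer tie_breaker keeps the dicts from being compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes with 0 / 1.
        w1, _, codes1 = heapq.heappop(heap)
        w2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]
```

A text could then be encoded with `"".join(code[ch] for ch in text)`, and the length of that bit string compared with a fixed-length encoding of the same text.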