Machine Learning for Cryptographic Algorithm Identification

Flávio Barbosa, Arthur Vidal, Flávio Mello

Abstract


This paper studies encrypted text files in order to identify the cryptographic algorithm used to produce them. Plain texts were encrypted with distinct cryptographic algorithms, and metadata were then extracted from the resulting ciphertexts. The algorithm is then identified using data mining techniques. First, texts in Portuguese, English, and Spanish were encrypted using the DES, Blowfish, RSA, and RC4 algorithms. Second, the encrypted files were submitted to data mining techniques such as the J48, FT, PART, Complement Naive Bayes, and Multilayer Perceptron classifiers. Charts built from the confusion matrices generated in the second step show that the identification rate for each algorithm is higher than that of a random guess. In several scenarios, identification accuracy reaches almost 97.23%.
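
The evaluation idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which metadata were extracted, so the byte-level features below (length, number of distinct byte values, Shannon entropy) are assumptions chosen for illustration, and the function names `byte_features`, `per_class_recall`, and `chance_level` are hypothetical. The sketch shows how a per-algorithm identification rate can be read off a confusion matrix and compared with the rate of a uniform random guess.

```python
import math
from collections import Counter


def byte_features(ciphertext: bytes) -> dict:
    """Assumed example metadata for one encrypted file (not taken from the paper)."""
    counts = Counter(ciphertext)
    total = len(ciphertext)
    # Shannon entropy of the byte distribution, in bits per byte.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return {
        "length": total,
        "distinct_bytes": len(counts),
        "entropy_bits": entropy,
    }


def per_class_recall(confusion: list) -> list:
    """Fraction of each true class identified correctly.

    Rows are true algorithms, columns are predicted algorithms.
    """
    recalls = []
    for i, row in enumerate(confusion):
        row_total = sum(row)
        recalls.append(row[i] / row_total if row_total else 0.0)
    return recalls


def chance_level(num_classes: int) -> float:
    """Expected rate of a uniform random guess over num_classes algorithms."""
    return 1.0 / num_classes
```

With the four algorithms named in the abstract (DES, Blowfish, RSA, RC4), `chance_level(4)` gives 0.25, so a classifier beats a random guess for a given algorithm when the corresponding entry of `per_class_recall` exceeds 0.25.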

Keywords


Cryptographic Algorithm Identification; Data Mining; Machine Intelligence





DOI: https://doi.org/10.17648/enig.v3i1.55



Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

