NeoPI

From Embedded Lab Vienna for IoT & Security

Summary

NeoPI is a Python script that uses a selection of statistical methods to detect encrypted and purposefully obfuscated code within a web server's files and scripts. NeoPI was developed by the Cisco CX Security Labs to detect well-hidden web shells. It is meant to be used in conjunction with existing malware detection tools such as Linux Malware Detect and with traditional keyword-based search methods.

Installation and Usage

  • Operating system: Any device running Windows or Linux that supports Python 2.6 or greater
  • Installation: git clone ssh://git@github.com:Neohapsis/NeoPI.git

NeoPI recursively scans all files below a given base directory and ranks them according to the results of a number of statistical tests. A general score is also calculated from the results of the individual tests. Because both false positives and false negatives are possible, NeoPI is intended as a guideline for administrators, not as a foolproof detection solution.
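The scan-and-rank idea can be sketched as follows. This is a minimal illustration in Python 3, not NeoPI's own code: the function name rank_files and its parameters are hypothetical, and the scoring function is passed in by the caller.

```python
import os

def rank_files(base_dir, test, reverse=True):
    """Walk base_dir recursively, score every file with test(data),
    and return (path, score) pairs sorted by score.

    rank_files and its parameters are hypothetical names for this sketch.
    """
    results = []
    for root, _dirs, files in os.walk(base_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                with open(path, "rb") as fh:
                    data = fh.read()
            except OSError:
                # Unreadable files are skipped rather than aborting the scan.
                continue
            results.append((path, test(data)))
    # Highest score first by default; pass reverse=False for lowest first.
    results.sort(key=lambda item: item[1], reverse=reverse)
    return results
```

For example, rank_files("/var/www", len) would rank files by size; any of the statistical tests described below could be passed instead of len.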

Description

Running NeoPI

Change into the directory that NeoPI is saved in and use the -h flag to see the options available:

./neopi.py -h

Options

-C FILECSV, --csv=FILECSV

Generates a CSV output file containing the results of the scan.

-e, --entropy

Runs only the entropy test. The entropy test measures the Shannon entropy of a file. Shannon entropy is a measure of the information contained in a message, usually specified in bits. For a file read byte by byte, it gives the minimum average number of bits required to encode each byte, so the value lies between 0 and 8. Measuring entropy can be useful for locating encrypted shellcode, because encryption usually increases the entropy of a file.
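A minimal Python 3 sketch of a byte-level Shannon entropy calculation is shown here. It illustrates the standard formula and is not taken from NeoPI's source; the function name shannon_entropy is an assumption.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0).

    shannon_entropy is a hypothetical name used for this sketch.
    """
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value
    # H = sum over byte values of -p * log2(p), with p = count / total
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())
```

A file consisting of a single repeated byte scores 0, while a file in which all 256 byte values occur equally often scores 8. Ordinary source code usually falls well below the maximum.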

-l, --longestword

Runs only the longest word test. This simple test finds the longest uninterrupted string within a file. Typical files and scripts on a web server are mostly made up of relatively short words. Since many web shells are encrypted or intentionally obfuscated, their source code often contains long strings of encoded text. For example, base64_encode can produce long strings of characters without spaces, which triggers this test. Unusually long strings within a file can hint at a web shell.
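The idea can be sketched in a few lines of Python 3. This sketch assumes that a "word" is any run of non-whitespace characters; NeoPI's actual delimiter set is not given here and may differ. The function name longest_word is hypothetical.

```python
import re

def longest_word(text: str) -> str:
    """Return the longest run of non-whitespace characters in text.

    Assumption: words are delimited by whitespace only.
    """
    words = re.findall(r"\S+", text)
    # default="" covers empty or whitespace-only input
    return max(words, key=len, default="")
```

In a line such as eval(base64_decode('aGVsbG8gd29ybGQ=')); the whole call forms a single 40-character "word", far longer than the tokens of ordinary hand-written code.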

-c, --ic

Runs only the Index of Coincidence (IC) test. The IC describes how likely it is that two characters picked at random from a text are the same, and it is commonly compared to the value for a text sample in which all characters are evenly distributed. The value returned by this test is generally consistent within a given type of text, whether natural language or a programming or scripting language. If a file's value differs significantly from the value expected for its text type, this could indicate the presence of encoded or encrypted text in the file.
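As an illustration, the textbook definition of the Index of Coincidence can be written in Python 3 as below. Whether NeoPI uses exactly this formula or a normalized variant is not stated here, so this is a sketch under that assumption; the function name index_of_coincidence is hypothetical.

```python
from collections import Counter

def index_of_coincidence(data: bytes) -> float:
    """Probability that two bytes drawn at random (without replacement)
    from data are equal.

    Textbook formula: sum(n_i * (n_i - 1)) / (N * (N - 1)).
    """
    total = len(data)
    if total < 2:
        return 0.0
    counts = Counter(data)
    return sum(n * (n - 1) for n in counts.values()) / (total * (total - 1))
```

Text with a flat byte distribution, as produced by encryption or dense encoding, yields a low value, which is why files with a low IC are the ones that stand out.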

-a, --all

Runs all tests, including the entropy test, the longest word test, and the Index of Coincidence test. This option is recommended for finding possible web shells as accurately as possible. The NeoPI documentation suggests, at a minimum, a manual review of all files listed under Highest Ranked Files. Reviewing files that show up in only a single test is recommended as well, since some tests are more effective at detecting certain web shells than others.

Results

The output of a full scan is split into four separate sections, each listing the highest ranking files for one of the tests that NeoPI runs or for the combined ranking.

  • The highest ranking files according to the IC test are the files with the lowest IC values, listed in ascending order. An average IC value for all files is also given.
  • For the entropy test, files are given a value between 0 and 8, where a higher value represents a higher Shannon entropy. Higher entropy values can point to encrypted or obfuscated files.
  • The longest word section lists the longest word contained in each file. An unusually long word could hint at the file being encoded with a function such as base64_encode.
  • Finally, files are ranked according to a combination of the scores received in all tests. This combined score is given as a percentage.
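One way to combine per-test rankings into a single percentage is sketched below in Python 3. This is only an assumed scheme for illustration: the exact formula NeoPI uses is not described here, and the function name combined_ranking and its input format are hypothetical.

```python
def combined_ranking(rankings):
    """Combine several per-test rankings into one percentage score.

    rankings maps a test name to a list of file paths ordered from most
    to least suspicious. Assumption: every test ranks the same files.
    """
    tests = list(rankings.values())
    if not tests:
        return []
    totals = {}
    for ordered in tests:
        for position, path in enumerate(ordered):
            # Position 0 is the most suspicious file in that test.
            totals[path] = totals.get(path, 0) + position
    # Worst possible total: last place in every test.
    worst = len(tests) * max(len(totals) - 1, 1)
    scored = [(path, 100.0 * (1 - total / worst)) for path, total in totals.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored
```

Under this scheme, a file that comes first in every test scores 100 percent and a file that comes last in every test scores 0 percent.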

References