PROTEUS
Advanced zero-day static analysis engine built with Rust and Python
Proteus is a high-performance malware analysis tool built with Rust and Python, designed to detect zero-day threats through static analysis, heuristics, and machine learning.
🎯 Features
Core Analysis
- 🔍 PE/ELF Binary Analysis - Deep inspection of Windows and Linux executables
- 📊 Entropy Calculation - Detect packed/encrypted malware (section-level granularity)
- 🧠 Heuristic Scoring - Intelligent threat assessment with configurable thresholds
- 🔤 String Extraction - ASCII and wide string analysis with pattern detection
- 🌐 IOC Detection - Automatic extraction of URLs, IPs, registry keys, file paths
- ⚡ High Performance - Rust-powered core with parallel processing via Rayon
- 📦 Batch Processing - Scan entire directories efficiently
Advanced Features
- 🤖 ML Ready - Feature extraction pipeline for machine learning
- 📈 Feature Engineering - 16+ features including entropy, imports, exports, strings
- 🎯 Detection Metrics - Built-in accuracy, precision, recall tracking
- 🔧 Extensible - Modular architecture for custom analyzers
📊 Detection Metrics (Test Dataset)
| Metric | Value |
|---|---|
| Detection Rate | 100% |
| False Positive Rate | 0% |
| Avg Clean Score | 20.73/100 |
| Avg Malicious Score | 66.00/100 |
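Rates like those in the table can be derived from per-sample verdicts. The following is a minimal sketch, assuming ground-truth labels and predicted verdicts are available as parallel lists of booleans. The function name `detection_metrics` and the input format are illustrative and are not taken from the Proteus code base.

```python
def detection_metrics(actual, predicted):
    """Compute basic detection metrics from parallel lists of booleans.

    True means "malicious". The name and input format are hypothetical;
    Proteus's built-in metric tracking may work differently.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # malicious, flagged
    fp = sum(1 for a, p in pairs if not a and p)      # clean, flagged
    tn = sum(1 for a, p in pairs if not a and not p)  # clean, not flagged
    fn = sum(1 for a, p in pairs if a and not p)      # malicious, missed

    def ratio(numerator, denominator):
        # Avoid division by zero on empty or one-sided datasets.
        return numerator / denominator if denominator else 0.0

    return {
        "detection_rate": ratio(tp, tp + fn),       # identical to recall
        "false_positive_rate": ratio(fp, fp + tn),
        "precision": ratio(tp, tp + fp),
        "accuracy": ratio(tp + tn, len(pairs)),
    }


if __name__ == "__main__":
    actual = [True, True, False, False]
    predicted = [True, False, True, False]
    for name, value in detection_metrics(actual, predicted).items():
        print(f"{name}: {value:.2%}")
```

On a small dataset a single misclassified sample moves these rates by several points, so it helps to report the sample counts alongside the percentages.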
🚀 Quick Start
Prerequisites
Installation
# Clone repository
git clone https://github.com/ChronoCoders/proteus.git
cd proteus
# Create virtual environment
python -m venv venv
# Activate (Windows)
venv\Scripts\activate
# Activate (Linux/Mac)
source venv/bin/activate
# Install dependencies
pip install maturin numpy scikit-learn torch
# Build Rust extension
maturin develop --release
Basic Usage
Analyze a single file:
python cli.py file C:\path\to\sample.exe
Analyze with full string extraction:
python cli.py file C:\path\to\sample.exe --strings
String-only analysis:
python cli.py strings C:\path\to\sample.exe
Batch scan directory:
python cli.py dir C:\path\to\samples --output results.json
Building Test Dataset
python test_dataset_builder.py
Training ML Models
python ml_trainer.py
📖 Documentation
Example Output
╔═══════════════════════════════════════╗
║ PROTEUS v0.1.0 ║
║ Zero-Day Static Analysis Engine ║
╚═══════════════════════════════════════╝
[*] Analysis: suspicious.exe
[+] Type: PE
[+] Entropy: 7.85
[+] Threat Score: 66.00/100
[+] Verdict: MALICIOUS
[!] Suspicious Indicators:
- VirtualAlloc
- CreateRemoteThread
- WriteProcessMemory
[*] String Analysis:
[+] Total strings: 342
[+] Encoded strings: 15
[!] URLs (2):
http://malicious-c2.com/payload
https://evil.net/download
[!] Suspicious strings (8):
cmd.exe /c powershell
Disable-WindowsDefender
keylogger.dll
Architecture
proteus/
├── src/ # Rust core engine
│ ├── lib.rs # Module entry point
│ ├── pe_parser.rs # PE file parsing (goblin)
│ ├── elf_parser.rs # ELF file parsing
│ ├── entropy.rs # Shannon entropy calculation
│ ├── heuristics.rs # Threat scoring algorithms
│ ├── string_extractor.rs # String analysis engine
│ └── python_bindings.rs # PyO3 FFI bindings
├── python/ # Python orchestration
│ ├── __init__.py
│ ├── analyzer.py # Main analyzer class
│ └── ml_detector.py # ML model integration
├── cli.py # Command-line interface
├── ml_trainer.py # ML training pipeline
├── test_dataset_builder.py # Dataset generation
├── Cargo.toml # Rust dependencies
└── pyproject.toml # Python dependencies
Feature Extraction
Proteus extracts 16+ features per sample:
Binary Features:
- Global entropy
- Section count
- Max section entropy
- Import count
- Export count
- Suspicious API count
String Features:
- Total strings
- URL count
- IP count
- Registry key count
- Suspicious keyword count
- File path count
- Encoded string count
- Encoded ratio
- Suspicious ratio
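To make the string features concrete, here is a rough Python sketch of how three of them (total strings, URL count, IP count) could be computed from raw bytes. It is not the Rust implementation in `string_extractor.rs`; the four-character minimum length, the regular expressions, and the function names are all assumptions made for illustration. The registry, file-path, and encoded-string features are left out because their exact definitions are not documented here.

```python
import re

# Runs of at least four printable ASCII bytes (minimum length is an assumption).
ASCII_RE = re.compile(rb"[\x20-\x7e]{4,}")
# "Wide" strings: printable ASCII bytes each followed by NUL (UTF-16LE).
WIDE_RE = re.compile(rb"(?:[\x20-\x7e]\x00){4,}")
# Deliberately simple IOC patterns; real extractors are usually stricter.
URL_RE = re.compile(r"https?://\S+")
IP_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_strings(data: bytes) -> list:
    """Return ASCII strings followed by wide strings found in data."""
    ascii_strings = [m.decode("ascii") for m in ASCII_RE.findall(data)]
    wide_strings = [m.decode("utf-16-le") for m in WIDE_RE.findall(data)]
    return ascii_strings + wide_strings


def is_valid_ipv4(text: str) -> bool:
    """Reject dotted quads with an octet above 255."""
    return all(int(part) <= 255 for part in text.split("."))


def string_features(data: bytes) -> dict:
    """Compute a subset of the string features listed above."""
    strings = extract_strings(data)
    urls = [u for s in strings for u in URL_RE.findall(s)]
    ips = [i for s in strings for i in IP_RE.findall(s) if is_valid_ipv4(i)]
    return {
        "total_strings": len(strings),
        "url_count": len(urls),
        "ip_count": len(ips),
    }


if __name__ == "__main__":
    sample = b"\x00\x01http://example.com/a\x00\x02" + "10.0.0.1".encode("utf-16-le")
    print(string_features(sample))
```

Ratio features such as the suspicious ratio would then be a count divided by `total_strings`, with a guard for samples that contain no strings.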
Threat Detection Patterns
High Entropy Indicators:
- Entropy > 7.8: Likely packed/encrypted
- Entropy > 7.5: Suspicious compression
- Entropy > 7.2: Elevated entropy
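The thresholds above apply to Shannon entropy measured in bits per byte, which ranges from 0 (a single repeated byte) to 8 (all byte values equally likely). A minimal Python sketch of the calculation and the threshold mapping follows. The function names and the "Normal" fallback label are assumptions, and the actual logic in `entropy.rs` may differ.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_label(entropy: float) -> str:
    """Map an entropy value to the indicator tiers listed above."""
    if entropy > 7.8:
        return "Likely packed/encrypted"
    if entropy > 7.5:
        return "Suspicious compression"
    if entropy > 7.2:
        return "Elevated entropy"
    return "Normal"  # fallback label is an assumption, not from the README


if __name__ == "__main__":
    uniform = bytes(range(256)) * 4   # every byte value equally frequent
    repeated = b"A" * 1024            # a single repeated byte
    for name, blob in (("uniform", uniform), ("repeated", repeated)):
        value = shannon_entropy(blob)
        print(f"{name}: {value:.2f} -> {entropy_label(value)}")
```

Running the same function over each section's bytes rather than the whole file gives the section-level granularity mentioned in the feature list.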
Suspicious APIs (PE):
VirtualAlloc, VirtualProtect, WriteProcessMemory,
CreateRemoteThread, LoadLibrary, GetProcAddress,
WinExec, ShellExecute, URLDownloadToFile,
CreateProcess, OpenProcess, ReadProcessMemory,
SetWindowsHookEx, GetAsyncKeyState, InternetOpen
Suspicious Symbols (ELF):
execve, system, fork, ptrace, mprotect,
mmap, dlopen, socket, bind
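The two watchlists above can be turned into a "suspicious API count" by checking a binary's imported names against them. The sketch below assumes the import and symbol names have already been pulled out by a parser. The helper names are hypothetical, and stripping the Windows `A`/`W` suffix (as in `LoadLibraryA`) is an assumption about how matching might be done, not a description of `heuristics.rs`.

```python
SUSPICIOUS_PE_APIS = frozenset({
    "VirtualAlloc", "VirtualProtect", "WriteProcessMemory",
    "CreateRemoteThread", "LoadLibrary", "GetProcAddress",
    "WinExec", "ShellExecute", "URLDownloadToFile",
    "CreateProcess", "OpenProcess", "ReadProcessMemory",
    "SetWindowsHookEx", "GetAsyncKeyState", "InternetOpen",
})

SUSPICIOUS_ELF_SYMBOLS = frozenset({
    "execve", "system", "fork", "ptrace", "mprotect",
    "mmap", "dlopen", "socket", "bind",
})


def strip_ansi_wide_suffix(name: str) -> str:
    """Drop a trailing A/W only when the remainder is a watched API."""
    if name[-1:] in ("A", "W") and name[:-1] in SUSPICIOUS_PE_APIS:
        return name[:-1]
    return name


def suspicious_pe_imports(imports):
    """Return the imported names that match the PE watchlist."""
    return [n for n in imports if strip_ansi_wide_suffix(n) in SUSPICIOUS_PE_APIS]


def suspicious_elf_symbols(symbols):
    """Return the symbols that exactly match the ELF watchlist."""
    return [s for s in symbols if s in SUSPICIOUS_ELF_SYMBOLS]


if __name__ == "__main__":
    pe_imports = ["LoadLibraryA", "CreateProcessW", "VirtualAlloc", "GetTickCount"]
    elf_symbols = ["execve", "printf", "socket", "socketpair"]
    print(suspicious_pe_imports(pe_imports))
    print(suspicious_elf_symbols(elf_symbols))
```

Exact matching is used for ELF symbols because prefix matching would also flag unrelated names such as `socketpair`.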
Suspicious Keywords (Strings):
cmd, powershell, eval, exec, system, shell,
download, upload, exploit, payload, inject,
keylog, screenshot, webcam, ransomware,
encrypt, bitcoin, miner, bypass, disable
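A keyword list like this is typically applied as a case-insensitive substring search over the extracted strings. The following sketch shows one way to do that and to derive a suspicious ratio from the result. The function names are hypothetical, and substring matching is an assumption; it is simple but also flags benign strings such as `FileSystem`, which contains `system`.

```python
SUSPICIOUS_KEYWORDS = (
    "cmd", "powershell", "eval", "exec", "system", "shell",
    "download", "upload", "exploit", "payload", "inject",
    "keylog", "screenshot", "webcam", "ransomware",
    "encrypt", "bitcoin", "miner", "bypass", "disable",
)


def find_suspicious_strings(strings):
    """Return strings containing any keyword, ignoring case."""
    hits = []
    for text in strings:
        lowered = text.lower()
        if any(keyword in lowered for keyword in SUSPICIOUS_KEYWORDS):
            hits.append(text)
    return hits


def suspicious_ratio(strings):
    """Fraction of strings that contain at least one keyword."""
    if not strings:
        return 0.0
    return len(find_suspicious_strings(strings)) / len(strings)


if __name__ == "__main__":
    extracted = [
        "cmd.exe /c powershell",
        "Disable-WindowsDefender",
        "keylogger.dll",
        "Hello, world",
    ]
    print(find_suspicious_strings(extracted))
    print(suspicious_ratio(extracted))
```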
🔬 Development
Build & Test
# Development build
maturin develop
# Release build
maturin develop --release
# Run Rust tests
cargo test
# Run Python tests
python -m pytest
# Code quality checks
cargo clippy
mypy .
Contributing
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
Code Style
- Rust: Follow `rustfmt` and `clippy` recommendations
- Python: Follow PEP 8, type hints required
- No comments in code (self-documenting code preferred)
- Use latest stable versions of dependencies
🗺️ Roadmap
v0.2.0 (Planned)
- YARA rule engine integration
- Advanced packer detection (UPX, ASPack, Themida)
- Digital signature validation
- PE resource section analysis
- Improved ML models with larger datasets
v0.3.0 (Future)
- HTML report generation
- REST API server
- Web dashboard
- Real-time monitoring
- PCAP analysis integration
- Behavior monitoring (dynamic analysis)
📊 Performance
Benchmarks (Intel i7, 16GB RAM):
- Single file analysis: ~50ms
- Batch processing (100 files): ~3 seconds
- String extraction: ~20ms
- ML prediction: ~5ms
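Timings like these vary with hardware, file size, and cache state, so it can be useful to measure them locally. Below is a small harness built on Python's `time.perf_counter`. The `benchmark` helper and the stand-in workload are illustrative only; replace the workload with whatever analysis call your checkout exposes.

```python
import statistics
import time


def benchmark(func, arg, runs=20):
    """Call func(arg) repeatedly and summarize wall-clock time in milliseconds."""
    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        func(arg)
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        "runs": runs,
        "min_ms": min(timings_ms),
        "median_ms": statistics.median(timings_ms),
        "max_ms": max(timings_ms),
    }


def stand_in_workload(data: bytes) -> int:
    """Placeholder for an analysis call: counts distinct byte values."""
    return len(set(data))


if __name__ == "__main__":
    payload = bytes(range(256)) * 4096  # roughly 1 MiB of synthetic input
    result = benchmark(stand_in_workload, payload)
    print(f"median {result['median_ms']:.2f} ms over {result['runs']} runs")
```

The median is reported rather than the mean because the first run is often slower while files and code are still cold in cache.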
⚠️ Limitations
Current Version (v0.1.0):
- Test dataset uses synthetic malware samples
- ML models trained on limited data
- No dynamic analysis capabilities
- Windows-focused (PE analysis more mature than ELF)
Recommended Use:
- Educational purposes
- Research projects
- Proof-of-concept deployments
- Static analysis component in larger systems
🔒 Security & Legal
Important Notes:
- Always analyze malware in isolated environments (VMs/sandboxes)
- Do not use on production systems without proper testing
- Obey local laws regarding malware possession and analysis
- This tool is for educational and research purposes only
Disclaimer: The authors are not responsible for misuse of this tool. Users are solely responsible for ensuring their usage complies with applicable laws and regulations.
📝 License
MIT License - see LICENSE file for details
Copyright (c) 2025 ChronoCoders
👥 Authors
ChronoCoders Team
- Advanced static analysis engine
- ML integration
- Performance optimization
🙏 Acknowledgments
- goblin - Excellent binary parsing library
- PyO3 - Seamless Rust-Python integration
- Rayon - Parallel processing made easy
- scikit-learn - ML algorithms
📚 Additional Resources
⭐ If you find Proteus useful, please star the repository!
🐛 Found a bug? Open an issue
💡 Have a feature request? Start a discussion
File details
Details for the file proteus_analyzer-0.1.0.tar.gz.
File metadata
- Download URL: proteus_analyzer-0.1.0.tar.gz
- Upload date:
- Size: 173.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b5cf36923b3efcf4149e4778f55e1424f8cb24dd52ca575502deaf5c46b28ac0 |
| MD5 | 0d10317539849b5b7df4629b5ad9d507 |
| BLAKE2b-256 | d1aaddccfba3e76a4376a3724e63d559ef5e66342c11ad083efc236633ea5b88 |
File details
Details for the file proteus_analyzer-0.1.0-cp310-cp310-win_amd64.whl.
File metadata
- Download URL: proteus_analyzer-0.1.0-cp310-cp310-win_amd64.whl
- Upload date:
- Size: 234.5 kB
- Tags: CPython 3.10, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9c46509f483ba6e1fd568f1be621e0a14dd700b17f2baffd7caa1177707335e0 |
| MD5 | 41742f4ac59d487fec73a2baee0b413b |
| BLAKE2b-256 | ffa471d6f979964e78bbaaf7420172b1c61b87f607b50a042a98bcc505691ad8 |