AIL Training
Alexandre Dulaunoy
Sami Mokaddem
Aurélien Thirion
@SUNET 20180207
Objectives of the workshop
Our objectives of the workshop
Sources of leaks
Sources of leaks: Paste monitoring
• Example: http://pastebin.com/
◦ Easily storing and sharing text online
◦ Used by programmers and legitimate users
→ Source code & information about configurations
• Abused by attackers to store:
◦ List of vulnerable/compromised sites
◦ Software vulnerabilities (e.g. exploits)
◦ Database dumps
→ User data
→ Credentials
→ Credit card details
◦ More and more ...
Examples of pastes
Sources of leaks: Others
• Mistakes from users
◦ https://github.com/search?q=remove password&type=Commits&ref=searchresults
Why so many leaks?
Are leaks frequent?
Yes!
and we have to deal with this as a CSIRT.
• Contacting companies or organisations responsible for specific
accidental leaks
• Discussing specific leak cases with the media to make the
coverage more practical and factual for everyone
• Evaluating the economic market for cyber criminals (e.g. DDoS
booters¹ or the reselling of personal information: reality versus
media coverage)
• Analysing the collateral effects of malware, software
vulnerabilities or exfiltration
→ And it’s important to detect them automatically.
1 https://github.com/D4-project/
Paste monitoring at CIRCL: Statistics
2 http://www.circl.lu/pub/tr-46
Privacy, AIL and GDPR
• Many modules in AIL can process personal data and even special
categories of data as defined in GDPR (Art. 9).
• The data controller is often the operator of the AIL framework
(limited to the organisation) and has to define legal grounds for
processing personal data.
• To help users of the AIL framework, a document is available which
describes how AIL relates to the regulation³.
3 https://www.circl.lu/assets/files/information-leaks-analysis-and-gdpr.pdf
Potential legal grounds
AIL Framework
From a requirement to a solution: AIL Framework
History:
• AIL started as an internship project (2014) to evaluate the
feasibility of automating the analysis of (un)structured
information to find leaks.
• As of 2018, the AIL framework is open source software written
in Python. It is actively used and maintained by CIRCL.
AIL Framework: A framework for Analysis of Information Leaks
Other leaks
AIL Framework: Current capabilities
AIL Framework: Current features
• Extracting credit card numbers, credentials, phone numbers,
...
• Extracting and validating potential hostnames
• Keeping track of duplicates
• Submission to threat sharing and incident response platforms
(MISP and TheHive)
• Full-text indexer to index unstructured information
• Tagging for classification and searches
• Tracking of terms, sets and regexes, and of their occurrences
• Submission of archives, files and raw text from the UI
• Sentiment/Mood analyser for incoming data
• And many more
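The first feature in the list, extracting credit card numbers, is commonly done in two steps: a regular expression finds candidate digit runs, then the Luhn checksum discards most random matches. The sketch below illustrates that idea with the standard library only. The pattern, the function names (`luhn_valid`, `extract_card_numbers`) and the sample text are assumptions made for illustration, not AIL's actual module.

```python
import re
from typing import List

# Candidate pattern (assumption): 13 to 19 digits, optionally separated
# by single spaces or dashes.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def extract_card_numbers(text: str) -> List[str]:
    """Return digit runs that look like card numbers and pass Luhn."""
    found = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            found.append(digits)
    return found


# Hypothetical paste content: one well-known test number, one random run.
sample = "order 4111 1111 1111 1111 shipped, ref 1234-5678-9012-3456"
print(extract_card_numbers(sample))
```

The checksum step matters in practice: pastes are full of long digit runs (timestamps, identifiers), and a bare regex would flag most of them.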
Live demo!
Example: Following a notification (0) - Dashboard
Example: Following a notification (1) - Searching
Example: Following a notification (2) - Metadata
Example: Following a notification (3) - Browsing content
Setting up the framework
Setting up AIL-Framework from source or virtual machine
Setting up AIL-Framework from source
AIL ecosystem: Technologies used
AIL global architecture 1/2
Figure: AIL global architecture. Labelled components: ZMQ feed, import_dir.py (ZMQ), submission via GUI, AIL Mixer, Redis PubSub, AIL framework modules (Credentials, credit-cards), Redis, ARDB (RocksDB), Flask server and web interface.
Data feeder: Gathering pastes with pystemon
Figure: Pystemon global architecture. Labelled components: Pystemon3, Redis PubSub 1 (port 6380, channel queuing), Redis PubSub 2 (port 6380, channel script), AIL Subscriber.
AIL global architecture: Data streaming between modules
AIL global architecture: Data streaming between modules (Credential example)
Message consuming
Figure: Module x adds messages to a Redis set (SADD); instances of Module y consume them (SPOP).
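The message-consuming scheme above is a simple work queue: a producing module adds messages to a Redis set with SADD, and each instance of the consuming module removes one with SPOP, so no message is handled twice. The sketch below mimics that pattern with an in-memory stand-in so it runs without a Redis server. `FakeRedisSet` is a hypothetical class written for this example, and the queue name and message strings are assumptions as well. With a real server, the same two operations are available as `sadd` and `spop` on a redis-py client.

```python
import random


class FakeRedisSet:
    """In-memory stand-in (hypothetical) for the two Redis set commands used here."""

    def __init__(self):
        self._sets = {}

    def sadd(self, key, *values):
        """Add values to the set at key; return how many were new."""
        target = self._sets.setdefault(key, set())
        before = len(target)
        target.update(values)
        return len(target) - before

    def spop(self, key):
        """Remove and return a random member, or None if the set is empty."""
        target = self._sets.get(key)
        if not target:
            return None
        member = random.choice(sorted(target))
        target.remove(member)
        return member


def produce(server, queue, messages):
    """Module x: push every message into the shared set."""
    for message in messages:
        server.sadd(queue, message)


def consume(server, queue):
    """Module y: pop messages until the set is empty."""
    handled = []
    while True:
        message = server.spop(queue)
        if message is None:
            break
        handled.append(message)
    return handled


server = FakeRedisSet()
# "Credential_in" is an assumed queue name used only in this sketch.
produce(server, "Credential_in", ["paste-1", "paste-2", "paste-2", "paste-3"])
handled = consume(server, "Credential_in")
print(sorted(handled))
```

Two properties of a set-based queue are visible here: a message submitted twice is stored once, and consumers receive messages in no particular order.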
Figure: Architecture of AIL and its hidden services crawler (labels: AIL-framework, Splash in Docker containers)
Starting the framework
Running your own instance from source
Running your own instance using the virtual machine
Feeding the framework
Feeding AIL
Plugging AIL into the CIRCL feed
You can freely access the CIRCL feed during this workshop!
• In the file bin/package/config.cfg,
• set ZMQ_Global->address to tcp://crf.circl.lu:5556
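The same change can be scripted. The sketch below uses Python's `configparser` on an in-memory copy of the configuration so that it runs anywhere. The section name `ZMQ_Global`, the starting address and the helper `point_to_feed` are assumptions made for this example, so check them against your own config.cfg before applying anything to a real file.

```python
import configparser
import io

# Assumed excerpt of bin/package/config.cfg; the real file has more sections.
ORIGINAL = """\
[ZMQ_Global]
address = tcp://127.0.0.1:5556
"""

CIRCL_FEED = "tcp://crf.circl.lu:5556"


def point_to_feed(config_text: str, address: str) -> str:
    """Return the configuration text with ZMQ_Global->address replaced."""
    cfg = configparser.ConfigParser()
    cfg.read_string(config_text)
    cfg.set("ZMQ_Global", "address", address)
    buffer = io.StringIO()
    cfg.write(buffer)
    return buffer.getvalue()


updated = point_to_feed(ORIGINAL, CIRCL_FEED)
print(updated)
```

Note that `configparser` does not preserve comments when it rewrites a file, so editing the single line by hand may be preferable on a commented configuration.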
Via the UI (1)
Via the UI (2)
Feeding AIL with your own data - import_dir.py (1)
/!\ 2 requirements:
Feeding AIL with your own data - import_dir.py (2)
Creating new features
Developing new features: Plugging a module into the system
Choose where to put your module in the data flow:
Case study: Push alert to MISP
Push alert to MISP
4 https://www.misp-project.org/taxonomies.html
Case study: Finding the best place in the system
Case study: Updating Flask_server.py
Flask_server.py
    [...]
    # ========== INITIAL tags auto export ============
    # Connection to the ARDB database, configured in the ARDB_DB section
    r_serv_db = redis.StrictRedis(
        host=cfg.get("ARDB_DB", "host"),
        port=cfg.getint("ARDB_DB", "port"),
        db=cfg.getint("ARDB_DB", "db"),
        decode_responses=True)
    infoleak_tags = taxonomies.get('infoleak').machinetags()
    infoleak_automatic_tags = []
    # Export every tag whose part before '=' is infoleak:automatic-detection
    for tag in taxonomies.get('infoleak').machinetags():
        if tag.split('=')[0][:] == 'infoleak:automatic-detection':
            r_serv_db.sadd('list_export_tags', tag)

    # Tags added to the export list explicitly
    r_serv_db.sadd('list_export_tags', 'infoleak:submission="manual"')
    r_serv_db.sadd('list_export_tags', '<your_tag>')
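To see which tags the loop above selects without a running ARDB instance, the selection can be reproduced on a plain list. In the sketch below, the sample machine tags and the `select_export_tags` helper are assumptions for illustration. In the listing itself the tags come from `taxonomies.get('infoleak').machinetags()` and end up in the `list_export_tags` set.

```python
PREFIX = 'infoleak:automatic-detection'

# Illustrative machine tags (assumed sample, not the full taxonomy).
sample_tags = [
    'infoleak:automatic-detection="credit-card"',
    'infoleak:automatic-detection="iban"',
    'infoleak:analyst-detection="credit-card"',
    'infoleak:submission="manual"',
]


def select_export_tags(tags, extra=()):
    """Keep tags whose part before '=' equals PREFIX, then add the extras."""
    selected = {tag for tag in tags if tag.split('=')[0] == PREFIX}
    selected.update(extra)
    return selected


# Mirror the listing: automatic-detection tags plus one tag added explicitly.
export_tags = select_export_tags(
    sample_tags, extra=['infoleak:submission="manual"'])
for tag in sorted(export_tags):
    print(tag)
```

A Python set stands in for the Redis set here, which keeps the same property as `sadd`: adding a tag that is already present has no effect.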
Auto Push Tags
Create an event
Practical part
Practical part: Pick your choice
Contribution rules
How to contribute
Glimpse of contributed features
• Docker
• Ansible
• Email alerting
• SQL injection detection
• Phone number detection
How to contribute
• Feel free to fork the code, play with it, make some patches or add
additional analysis modules.
• Feel free to make a pull request for your contribution
• That’s it!
Final words
Annexes
Managing the framework
Managing AIL: The old-fashioned way
Access the script screen
    screen -r Script
Shortcut    Action
C-a d       Detach screen
C-a c       Create new window
C-a n       Next window
C-a p       Previous window
Managing your modules: Using the helper