dissension: Notes from DrupalCon - Keeping the lights on (operations and monitoring best practices)

The following are my notes from “Keeping the lights on - operations and monitoring best practices,” presented on Wednesday, March 21st, 2012 at DrupalCon Denver.

“Measurement is the link between mathematics and science” - Brian Ellis, Cambridge, 1968

Primary topics

Platform management, monitoring, and measurement
Security testing and monitoring

Monitoring - mean time to recovery is a key metric (how long does it take to fix)

Ongoing operational security
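As a rough illustration of the mean-time-to-recovery metric mentioned above, here is a minimal Python sketch. The incident timestamps are made up for the example, and the function name is my own; the session did not prescribe a specific formula beyond "how long does it take to fix."

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Average the (resolved - detected) duration over all incidents."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum((resolved - detected for detected, resolved in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incident log: (detected, resolved) pairs.
incidents = [
    (datetime(2012, 3, 1, 9, 0), datetime(2012, 3, 1, 9, 30)),
    (datetime(2012, 3, 8, 14, 0), datetime(2012, 3, 8, 15, 30)),
]

mttr = mean_time_to_recovery(incidents)
print(mttr)
```

Tracking this number over time is what turns it into a trend rather than a one-off measurement.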

Essential Monitoring Features

Real-time AND trend monitoring

Infrastructure based

Custom plugin system

Avoid proprietary languages to ensure anyone can contribute

Runs your functional tests
Active AND passive monitoring

Push alerts

Log analysis
Escalation

Quality of life - levels, rotations

Remote command/“job” execution

Functional tests

Use Selenium

Business metrics

PageRank
Things that are relative to the business
Number of users

Technical monitoring

APC tool

Service state

Cron - execute from remote monitoring system like Nagios
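To make the remote-cron idea concrete, here is a small Python sketch of how a Nagios-style check might classify a cron run. The exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) are the standard Nagios plugin convention. The function name, the thresholds, and the idea of passing in the HTTP status and the age of the last run are my own assumptions for illustration, not something shown in the session.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_cron(http_status, seconds_since_last_run,
               warn_after=3600, crit_after=3 * 3600):
    """Map a cron request result to a (Nagios exit code, message) pair.

    http_status is None when the cron URL could not be reached at all.
    The thresholds are hypothetical defaults.
    """
    if http_status is None:
        return UNKNOWN, "UNKNOWN - no response from cron URL"
    if http_status != 200:
        return CRITICAL, "CRITICAL - cron URL returned HTTP %d" % http_status
    if seconds_since_last_run >= crit_after:
        return CRITICAL, "CRITICAL - cron last ran %ds ago" % seconds_since_last_run
    if seconds_since_last_run >= warn_after:
        return WARNING, "WARNING - cron last ran %ds ago" % seconds_since_last_run
    return OK, "OK - cron ran %ds ago" % seconds_since_last_run


code, message = check_cron(200, 120)
print(code, message)
```

A real plugin would fetch the site's cron URL itself, print the message, and exit with the returned code so the monitoring system can alert on it.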

Nagios Module

http://drupal.org/project/nagios

Job Automation

Jenkins is the de facto standard for continuous integration and deployment

Codify and script all deployment activities

Logging

Turn on syslog logging - instead of database, write to a text file

Centralized off-server
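The same "log to syslog, ship it off-server" pattern can be sketched outside Drupal. The Python example below uses the standard library's `SysLogHandler` to send messages over UDP to a central host. The host address, logger name, and facility are placeholders I picked for the example; in Drupal itself you would enable the core syslog module rather than write code like this.

```python
import logging
import logging.handlers


def build_central_logger(host, port=514, name="drupal-site"):
    """Return a logger that forwards records to a syslog host over UDP.

    The host, port, and name are hypothetical; point them at your own log server.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(
        address=(host, port),
        facility=logging.handlers.SysLogHandler.LOG_LOCAL0,
    )
    handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


# 127.0.0.1 stands in for the central log host in this sketch.
logger = build_central_logger("127.0.0.1")
logger.warning("login failed for user %s", "admin")
```

Keeping logs off the web server means an attacker who compromises the site cannot quietly erase the evidence.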

Monitoring Overview

Ping or HTTP result code alert monitoring || Live user story testing and trend analysis

Crontabs and poormanscron || centralized cron management

Logging to database only || Syslog logging to central host

Logging in to see Drupal errors and available updates || Centralized Drupal monitoring

Offsite backups || Off-cloud backups

Book recommendation

The Visible Ops Handbook

Security Testing and Monitoring

Tools and services to detect and respond to vulnerabilities and threats.

Detect

Finding the problem

Respond

Mitigate, fix, alert

Having a response plan before incidents occur

Vulnerabilities

Weaknesses

Threats

Ways to attack, whether or not they are successful

Vulnerabilities (OWASP Top 10)

Injection

XSS - biggest problem in Drupal

Broken auth/session - using core? OK

Insecure direct object reference - managing access

CSRF

Misconfiguration

Insecure cryptographic storage - site specific, SSH, using a VPN to encrypt traffic

Exception - password hash, encrypted information within site and database (encryption module)

Failure to restrict URL access

Insufficient transport layer protection - https

Unvalidated redirects and forwards
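Since XSS was called out as the biggest problem in the list above, here is a minimal Python sketch of the underlying fix: escape user-supplied text before it goes into HTML. It uses the standard library's `html.escape` as a stand-in; Drupal has its own sanitization functions, and the `render_comment` helper is hypothetical.

```python
import html


def render_comment(author, body):
    """Build an HTML fragment with all user-supplied text escaped."""
    return "<p><strong>{}</strong>: {}</p>".format(
        html.escape(author), html.escape(body)
    )


malicious = '<script>alert("xss")</script>'
rendered = render_comment("visitor", malicious)
print(rendered)
```

The script tag comes out as inert text instead of executing in the reader's browser.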

Detecting Vulnerabilities

Automated code reviews

Static: Coder module, Secure Code Review module, Acquia

Dynamic: Not common

Automated penetration testing

Generic tools: Grendelscan (open source), Fortify, Rational

Drupal Tools: Acquia

Manual code reviews

Look for injectable queries such as: db_query("DELETE FROM {users} WHERE name = '$name'");
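The problem with a query built by dropping a variable straight into the SQL string is easier to see when you run it. This Python sketch uses an in-memory SQLite database as a stand-in for Drupal's database layer (which supports placeholders for the same purpose). The table, function names, and payload are all made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany(
    "INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",), ("carol",)]
)


def delete_user_unsafe(conn, name):
    """Vulnerable: the value is pasted into the SQL text."""
    conn.execute("DELETE FROM users WHERE name = '%s'" % name)


def delete_user_safe(conn, name):
    """Safer: the value is passed separately as a bound parameter."""
    conn.execute("DELETE FROM users WHERE name = ?", (name,))


def count_users(conn):
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


payload = "x' OR '1'='1"

delete_user_safe(conn, payload)      # matches no user
remaining_after_safe = count_users(conn)

delete_user_unsafe(conn, payload)    # WHERE clause becomes always true
remaining_after_unsafe = count_users(conn)

print(remaining_after_safe, remaining_after_unsafe)
```

With the bound parameter the payload is treated as an odd user name; with string building it rewrites the query and empties the table.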

Manual penetration testing

Be an intelligent robot

Vuln.module (needs port to Drupal 7), Firefox: Tamper Data

Security review module

Responding to Vulnerabilities

Custom code:

Fix it

Test it

Deploy it

Contact customers (?)

Contributed Code

4 steps above

Work out a simple, repeatable test case

Report the issue to the Drupal Security Team

Compare to http://drupal.org/security-advisory-policy

Work with the Team and maintainer to get a fix

something else???

Detecting threats

Spam

Can be obvious indicator, but only if you’re actually monitoring

Defacement (can be hidden)

Use version control, Hacked! module

security_review.module

Watch revisions

Crowdsource (flag)
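The idea behind catching hidden defacement with version control is simply "compare what is on disk against a known-good copy." Here is a small Python sketch of that comparison using SHA-256 hashes. It is not how the Hacked! module or any version control system is implemented; the function names and the temporary directory standing in for a site root are assumptions for the example.

```python
import hashlib
import os
import tempfile


def hash_tree(root):
    """Return {relative path: SHA-256 hex digest} for every file under root."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            manifest[os.path.relpath(path, root)] = digest
    return manifest


def changed_files(baseline, current):
    """List paths that were added, removed, or modified since the baseline."""
    paths = set(baseline) | set(current)
    return sorted(p for p in paths if baseline.get(p) != current.get(p))


# A temporary directory stands in for the site's document root.
with tempfile.TemporaryDirectory() as site_root:
    index = os.path.join(site_root, "index.php")
    with open(index, "w") as handle:
        handle.write("<?php // original\n")
    baseline = hash_tree(site_root)

    with open(index, "a") as handle:
        handle.write("// injected\n")
    tampered = changed_files(baseline, hash_tree(site_root))

print(tampered)
```

In practice the baseline would live somewhere the web server cannot write to, otherwise an attacker can update it too.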

Code injection (xss, php)

IDS - PHPIDS, TinyIDS

http://www.codetrax.org/projects/tinyids

https://phpids.org/

Web Application Firewall

mod_security for Apache

Brute force password

Read watchdog all day long

Droptor http://drupal.org/project/droptor
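Nobody actually wants to read watchdog all day long, so the brute-force check is a good candidate for automation. This Python sketch flags IP addresses with too many failed logins inside a sliding time window. The event format, threshold, and window size are assumptions I made up; a real version would parse them out of the watchdog or syslog entries.

```python
from collections import defaultdict


def suspicious_ips(failed_logins, threshold=5, window_seconds=300):
    """Return IPs with at least `threshold` failures within `window_seconds`.

    failed_logins is an iterable of (unix_timestamp, ip) pairs.
    """
    by_ip = defaultdict(list)
    for timestamp, ip in failed_logins:
        by_ip[ip].append(timestamp)

    flagged = []
    for ip, times in by_ip.items():
        times.sort()
        start = 0
        for end, current in enumerate(times):
            # Slide the window start forward until it spans <= window_seconds.
            while current - times[start] > window_seconds:
                start += 1
            if end - start + 1 >= threshold:
                flagged.append(ip)
                break
    return sorted(flagged)


# Hypothetical events: one address hammering the login form, one occasional typo.
events = [(1000 + i * 10, "203.0.113.7") for i in range(6)]
events += [(1000, "198.51.100.2"), (5000, "198.51.100.2")]

flagged = suspicious_ips(events)
print(flagged)
```

The flagged list could feed an alert or a firewall block rather than a person staring at logs.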

Responding to threats

Spam

Mollom, Akismet

Spam, flag_abuse

Defacement

Revert to good copies from version control

Overwrite with new versions

Node revisions, db backup

Code injection

Keep code safe

Proactively block attackers at the firewall

Brute force password

login_security module

Included in Drupal 7 core

Help with everything: httpBL

http://drupal.org/project/httpbl

honeypot

Reduces crawlers and malicious spiders

Site monitoring

Internal/Free

Views

Mailmon - brand new

Quant - charting

Report - charting

Chart (system_charts)

External/Paid

Acquia network - ~$350/year, includes library, support

Acquia Insight

Droptor - $24/month/site, monitoring only

Drupalmonitor.com - unknown pricing

Three keys to ongoing operational security

Vigilance

Strong Chain

Incident Handling

What are the things that we need to do on an ongoing basis after launch?

Maintain eternal vigilance

Automate as much as possible

Avoiding human error - often “I was too busy to get to it”

Conduct periodic audits

Never sleep

Periodic Audit Program

Email Trent Hein of AppliedTrust for a copy of the PDF

http://www.appliedtrust.com/trent

Avoiding weak links in the chain

Education

Training

Awareness

Patching

PCI DSS requires patching of all critical infrastructure within 30 days

What:

Linux or other underlying OS

Firewall infrastructure

Switches

Wireless Access Points

… more
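The 30-day patching window above is easy to check mechanically once you track when each patch was released. A minimal Python sketch, assuming a hand-maintained list of pending patches; the component names and dates are invented for the example.

```python
from datetime import date, timedelta

# 30-day window as noted for PCI DSS above.
PATCH_WINDOW = timedelta(days=30)


def overdue_patches(pending, today):
    """Return components whose pending patch is older than the window.

    pending is a list of (component, patch_release_date) pairs.
    """
    return [
        component
        for component, released in pending
        if today - released > PATCH_WINDOW
    ]


# Hypothetical pending patches.
pending = [
    ("os-kernel", date(2012, 2, 1)),
    ("firewall-firmware", date(2012, 3, 10)),
]

overdue = overdue_patches(pending, today=date(2012, 3, 21))
print(overdue)
```

Running something like this from the monitoring system turns "I was too busy to get to it" into an alert.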

Incident Management (needs to be written)

Initial Response

Notification and Escalation

Smallest possible group for as long as possible, then figure out communication

Response Strategy

Do we need to update? Notify users?

One important take-away

Don’t use the same password on multiple sites you administer (PlayStation Network)

Secure Site Admin Pledge

I pledge to take the following steps to be a responsible Drupal site administrator:

I have set a unique, strong password for any accounts with administrative privileges, and I do not share passwords across sites

I use multi-factor authentication (e.g., ssh keys) for OS-level access and have password-only access disabled on my systems.

I have and execute a patching plan that includes the OS, web server, and Drupal layers (including core, modules, and custom code)

I have and execute at least a minimalist periodic audit plan

I am aware of and comply with applicable information security requirements for the data that my site handles (HIPAA, PCI DSS, etc.)

I monitor vulnerability announcement mailing lists for the technologies I use on my site

I monitor my system regularly such that I know how it behaves under normal conditions

I have a documented incident handling plan that I am familiar with and can use in an emergency

I take responsibility for ensuring that any custom code is developed according to secure coding best practices and is evaluated before being put into production

I will be eternally vigilant and investigate any unusual/suspicious site behavior

I have a process in place to ensure non-production sites are appropriately protected from external access/crawling

I am an advocate for practical information security practices and avoid “Security theater” showmanship

Thank You!
Please get in touch to chat about these topics:

Greg: greg.knaddison@acquia.com

Ned: ned@appliedtrust.com

Trent: trent@appliedtrust.com
