See my notes and the agreement between SLAC networking, security, ATLAS, perfSONAR and ESnet at meetings in Salt Lake City.

-----Original Message-----
From: [log in to unmask] [mailto:[log in to unmask]] On Behalf Of Wei Yang
Sent: Thursday, February 04, 2010 2:26 PM
To: atlas-sccs-planning-l
Subject: Minutes of ATLAS-SCCS planning meeting Feb. 3, 2010

Attendees: Richard, Booker, Lance, Wei

<snip>

Network Monitoring:

Networking group is discussing with LBNL and others on perfSonar
installation at SLAC, a requirement from US ATLAS. Looking at separating the
services that need web-100 kernel (NDT, etc.) from others that only need
regular redhat kernel.

<snip>

Here for the record are my notes from today's hallway meetings. Please provide comments, corrections, etc so we can ensure we agree on the way forward.

SLAC USATLAS Tier 2 perfSONAR Hosts

A requirement from USATLAS is that Tier 2 sites run perfSONAR. This provides network monitoring and diagnosis. A standard hardware platform is used. The distribution is prepared by the Internet2 perfSONAR consortium, of which SLAC is a member. It is based on a recent production release of Linux (2.6.27.x) from Knoppix and is distributed on CD. The use of standard hardware and a single distribution results in a uniform platform. It includes bwctl, owamp and pingER, plus on-demand tools such as NDT and a traceroute/ping server. The NDT server uses the Web-100 modified TCP kernel. Web-100 was a research project, headed by Matt Mathis, that developed an instrumented TCP kernel enabling measurement and reporting of the internal performance of the kernel. It is currently supported by only a single person. Turning it into a more maintainable patch, using say SystemTap (a DTrace-like tracing tool for Linux), is a major effort.

So far this has worked at all other Tier 2 sites. Apart from SLAC, all of these are university sites. SLAC security has raised concerns that the platform has to be patched soon (2 to 4 weeks) after a security alert (CVE) is raised. This means keeping current with CVEs and identifying which ones impact security. Given the research nature of the kernel modifications, identifying where patches need to be applied, and applying them, is a non-trivial task.

FNAL's perfSONAR hosts are dual-homed. One link is in the DMZ, the other in the CMS cluster. They are running the latest Knoppix CD distribution from last autumn. They have not run into security concerns.

BNL have their hosts inside the border, and they do have a firewall. They install new releases as they become available from I2. They have not had to shut down due to lack of patches.

We had a lunchtime meeting with Eli, Brian, Gary and Shawn. We agreed that not all CVEs are relevant (e.g. those for ISDN or AppleTalk). The suggestion that NDT be declared an appliance does not work for SLAC, since it would need to go behind a firewall and thus would not be representative of production network performance.

The idea of back-porting to the RedHat stable version of Linux, 2.6.18, was not viewed favorably since it has a CUBIC bug.

It was agreed that in the ATLAS community the real value of perfSONAR is in bwctl/owamp/pingER for making regular tests of real transfer rates between sites. NDT has only really been used about 3 times in the last 6 months.

We agreed to split the function and go with the 3 host model: the first for bwctl high performance testing, the second for response time testing (owamp & pingER), the third for NDT. The first two will run on top of RedHat; the third will run from the CD and be turned off when it is not up to date. The CD is easy to install/configure since it has a web interface, but there is no such tool for installing/configuring perfSONAR bwctl, owamp and PingER on RedHat, so that will be harder. Shawn will provide a sample of his install/configuration.
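
For the record, the 3 host split can be written down as a small sketch. This is illustrative only: the role labels, the field names and the should_be_online helper are my own hypothetical names, not part of perfSONAR or any SLAC tooling, and the "patched" flag simply stands in for whatever check security ends up using.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    role: str           # hypothetical label for the host's function
    services: tuple     # measurement tools the host runs
    platform: str       # "redhat" or "live-cd" (labels assumed for this sketch)
    patched: bool = True  # placeholder for "up to date with relevant CVEs"


def should_be_online(host):
    """Apply the agreed rule: the CD-based host is turned off when not up to date."""
    if host.platform == "live-cd":
        return host.patched
    # RedHat hosts follow the normal site patching process and stay up.
    return True


# The three hosts as agreed in the notes; the NDT host is shown as
# out of date here purely to exercise the rule.
PLAN = (
    Host("throughput", ("bwctl",), "redhat"),
    Host("latency", ("owamp", "PingER"), "redhat"),
    Host("ndt", ("NDT",), "live-cd", patched=False),
)

online = [h.role for h in PLAN if should_be_online(h)]
print(online)
```

The point of the split is that the two RedHat hosts carry the regularly scheduled tests, so taking the NDT host offline does not interrupt them.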

We (Shawn, Gary, Les) agreed to try to meet with Jason and Aaron to bring them up to speed and to see if there is interest in preparing a RedHat/CentOS distribution for just bwctl/owamp/PingER.

We had a 4:30pm meeting with Brian, Jason, Aaron, Jeff Boote and Shawn, and Gary joined us towards the end. We went over some of the issues discussed previously and filled in some more details. Jason and Jeff filled us in on some of the plans to move to a Live CD based on CentOS/RedHat. This may happen in the latter part of 2010. It will make the installation of a non-Web-100 perfSONAR easier. In the interim, SLAC will set up two RedHat boxes and apply the perfSONAR RPMs. Shawn will configure two of his hosts, one for the perfSONAR high-speed transfer tests (bwctl) and one for response time (owamp). From these he will provide configuration information to SLAC (Fahad/Yee). If there are problems, send emails to the perfSONAR email list. Jason is also on the USATLAS list. Jason, Aaron and Shawn have agreed to assist where possible. SLAC will also look at adding PingER; Yee probably knows best how to accomplish this.