Planet Sysadmin               

          blogs for sysadmins, chosen by sysadmins...

May 23, 2013

bc-log

Packing and Unpacking files with GNU Tar

One of the most basic tasks for any sysadmin is packing and unpacking files. While there are many ways to perform this task, GNU Tar is probably one of the most recognized and commonly used tools among Linux/Unix users.

A little history on tar

The tar command appeared in the early days of Unix and has changed several times since. Originally it was used to take files, combine them into one file and write that file to a tape archive (tar). Nowadays tar is used mostly as a general-purpose tool to package many files into one single file, often compressed, for distribution or backup.

There are several common implementations of tar in use today, and because there are multiple implementations there are also some differences in the options and formats available. In today’s article I will not be showing all of the various options of tar (that’s what man pages are for), but rather will be showing commonly used flags and some not so common tricks.

Tar Basics

Creating a tar file

To create a basic tar you really only need to specify a few things.

  • -c Stands for create, you will see this a lot in our examples today
  • -f or --file immediately followed by a file or device will tell tar where to create the tar file
  • And finally the files or directories to package
$ tar -cf tarfile.tar file1.txt

 Extracting a tar file

Extracting a tar file is just as simple as creating one.

  • -x Stands for extract
  • -f or --file immediately followed by a file or device has the same usage as create
$ tar -xf tarfile.tar

Adding verbosity

By default tar does not output what it is doing; you can change this by adding verbosity to the command with the -v flag. In addition to adding verbosity we are also going to tar more than one file in our example. Packaging more than one file is the point of tar after all, isn’t it?

$ tar -cvf tarfile.tar files_dir/ file1.txt
files_dir/
files_dir/file3.txt
files_dir/file4.txt
file1.txt

As you can see packaging an entire directory is as simple as adding it to the list of files to package into a tar file.

Listing files in an existing tar

Sometimes you simply want to look at the files within a tar file without extracting them; to do so we can use the -t or --list flag. As a side note, it is generally good practice when you receive a tar file from an outside source to list the contents of the tarball to ensure you are not overwriting files you do not intend to.

$ tar -tf tarfile.tar
files_dir/
files_dir/file3.txt
files_dir/file4.txt
file1.txt
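The "list before you extract" habit can also be scripted. The following is only a rough sketch, run in a throwaway directory with made-up file names, that lists an archive and reports which members already exist in the current directory:

```shell
# Sketch only: work in a scratch directory so nothing real is touched
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo old > file1.txt
echo new > file2.txt
tar -cf tarfile.tar file1.txt file2.txt
rm file2.txt                      # only file1.txt still exists on disk

# List the archive and report every member that already exists here
conflicts=$(tar -tf tarfile.tar | while IFS= read -r name; do
    if [ -e "$name" ]; then
        echo "would overwrite: $name"
    fi
done)
echo "$conflicts"
```

Anything reported here is a file that a plain extract would replace.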

We can also add the verbose option and show the files attributes such as permissions, size and timestamps.

$ tar -tvf tarfile.tar
drwxrwxr-x madflojo/madflojo 0 2013-05-22 21:00 files_dir/
-rw-rw-r-- madflojo/madflojo 0 2013-05-22 21:00 files_dir/file3.txt
-rw-rw-r-- madflojo/madflojo 0 2013-05-22 21:00 files_dir/file4.txt
-rw-rw-r-- madflojo/madflojo 0 2013-05-22 20:42 file1.txt

An important note on tar is that it has the ability of retaining file attributes such as permissions, size and timestamps. When extracted as a user with proper privileges these attributes will be applied to the newly created files or overwritten files.
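To see this for yourself, here is a small sketch (the file name, mode and date are arbitrary choices for the illustration) that packs a file with a known mode and modification time, extracts it elsewhere with -p (preserve permissions), and lets you compare the two:

```shell
# Sketch only: work in a scratch directory
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "secret" > report.txt
chmod 640 report.txt
touch -t 201305222100 report.txt   # set a known modification time

tar -cf attrs.tar report.txt
mkdir restored
# -p asks tar to restore the recorded permissions rather than applying the umask
tar -xpf attrs.tar -C restored

# Both lines should show the same mode and timestamp
ls -l report.txt restored/report.txt
```

Restoring ownership generally needs root, which is why the privileges of the extracting user matter.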

Appending files to an existing tar

Once a tar file is created it is possible to add files with the -r or --append option. The append option, however, is not allowed when the file has been compressed.

$ tar -rvf tarfile.tar file2.txt
 file2.txt
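If the archive has already been gzipped, one possible workaround (a sketch with made-up file names, not the only way to do it) is to decompress it, append, and compress it again:

```shell
# Sketch only: work in a scratch directory
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo a > file1.txt
echo b > file2.txt
tar -czf tarfile.tar.gz file1.txt

gunzip tarfile.tar.gz          # leaves tarfile.tar
tar -rf tarfile.tar file2.txt  # append works on the uncompressed archive
gzip tarfile.tar               # back to tarfile.tar.gz

tar -tzf tarfile.tar.gz
```

For large archives this rewrites the whole file, so it is usually cheaper to collect everything before compressing.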

Adding gzip compression

Early versions of tar used Unix compress for file compression; after some time gzip compression was also added.

The old way

Some systems had implemented the gzip command but not a tar command that added gzip inherently. Originally if users wanted to create a tarball that was gzip compressed they would need to tar the file and then gzip it.

$ tar -cvf tarfile.tar file1.txt file2.txt
file1.txt
file2.txt
$ gzip tarfile.tar
$ ls -la tarfile.tar.gz
-rw-rw-r-- 1 madflojo madflojo 136 May 22 21:22 tarfile.tar.gz

The new way

Modern implementations of tar add gzip compression inherently; you can add this compression at the creation of the tar file with the -z or --gzip option.

$ tar -cvzf tarfile.tar.gz file1.txt file2.txt
file1.txt
file2.txt

 Adding bzip2 compression

bzip2 is a compression tool much like gzip. It uses a different algorithm and generally compresses better, but it takes longer to do so. To add bzip2 compression we simply add a -j to the command.

$ tar -cjvf tarfile.tar.bz files_dir
files_dir/
files_dir/file3.txt
files_dir/file4.txt

Extracting tarballs with compression

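Any time you are dealing with tar files that have been compressed, you should add the appropriate compression flag to other tar commands such as extract or list. Recent versions of GNU tar detect the compression automatically when reading an archive from a file, but other implementations may not. The following is an example of extracting a bzip2-compressed tar file.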

$ tar -xjvf tarfile.tar.bz files_dir
files_dir/
files_dir/file3.txt
files_dir/file4.txt

Listing tarballs with compression

The following is an example of listing the contents of a tar file that has bzip2 compression.

$ tar -tjf tarfile.tar.bz
files_dir/
files_dir/file3.txt
files_dir/file4.txt

Extract without replacing old files

The tar commands on today’s systems have the ability to extract files without overwriting an existing file. To enable this you will need to specify -k on the extract command.

$ tar -czf tarfile.tar.gz file1.txt file2.txt
$ rm file2.txt && echo "I removed file2" >> file1.txt
$ tar -xvzkf tarfile.tar.gz
file1.txt
file2.txt
$ cat file1.txt
I removed file2

Beyond the basic tar commands

Creating a tar with --files-from to avoid argument list too long

Sometimes specifying the files for tar to package is difficult, whether due to the number of files, the names of the files or simply because it is too much to type. Tar has the ability to read a file and create a tarball of the files listed within the input file.

Below is an example of one way to get around the argument list too long problem.

 The problem:

$ tar -czf ../tarfile.tgz *
bash: /bin/tar: Argument list too long

Solution:

$ ls > ../filestocopy.txt
$ tar -T ../filestocopy.txt -czf ../tarfile.tgz

In addition to the argument list too long scenario the -T flag can be useful for automated jobs that may need to run tar against many files.
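For automated jobs the file list does not have to live in a file at all. The sketch below assumes GNU tar (or another tar that supports --null) and a find that supports -print0; the directory and file names are invented for the example. It feeds a NUL-separated list to tar on standard input, which also copes with spaces in file names:

```shell
# Sketch only: work in a scratch directory
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir logs
touch logs/app.log "logs/with space.log" logs/notes.txt

# find emits NUL-terminated names; --null makes -T parse them that way,
# and "-T -" tells tar to read the list from standard input
find logs -type f -name '*.log' -print0 | tar --null -T - -czf logs.tgz

tar -tzf logs.tgz
```

Only the two .log files end up in the archive; notes.txt is skipped because find never printed it.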

Tarpipe (or TarCopy)

Tarpipe, sometimes referred to as tarcopy, is the process of using tar to copy files from one place to another.

The idea behind tarpipe is that tar has the ability to send the packaged files to stdout rather than to a file. When you use this you can pipe that stdout to another tar command in a different directory.

$ tar -cf - file* | (cd ../files_copied/ && tar -xf -)

The - after -f, where a file name would normally go, is what tells tar to send the output to standard out.
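A variation on the same idea, sketched here with made-up directory names, uses tar's -C (change directory) option on both sides instead of a subshell with cd:

```shell
# Sketch only: work in a scratch directory
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir source files_copied
echo one > source/file1.txt
echo two > source/file2.txt

# First tar: change into source/ and write the archive to stdout.
# Second tar: read the archive from stdin and unpack it into files_copied/.
tar -C source -cf - . | tar -C files_copied -xf -

ls files_copied
```

Which form you prefer is mostly a matter of taste; the subshell version makes the directory change more visible, while -C keeps everything on one command per side.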

Why use tar and not cp?

Originally the cp command did not support preserving timestamps and file permissions and that was one of the major reasons to use tarpipe rather than cp. However times have changed and modern-day cp commands do have the -p (preserve) option, but there is still one reason to use tarpipe over cp. It’s Faster!

$ time tar -cf - file* | (cd ../files_copied/ && tar -xf -)

real    0m0.010s
user    0m0.004s
sys    0m0.004s
$ time cp -p file* ../files_copied/

real    0m0.024s
user    0m0.000s
sys    0m0.000s

While 0.014s does not seem like a long time, the above commands only copied 2 files. If the files are large or if we start talking about millions of files, that difference starts adding up.

Using tarpipe to copy files to a remote system

Sometimes you may need to copy files from one system to another while retaining permissions and timestamps. Luckily tarpipe isn’t limited to local copies; you can also use it to copy to remote systems through SSH. While on most modern systems it’s probably better/faster to use rsync, if you are supporting an older OS that doesn’t have rsync this could save you some time.

$ tar -cf - file* | ssh remote-server "(cd /files_copied/ && tar -xf -)"

 


by Benjamin Cane at May 23, 2013 03:07 PM

Everything Sysadmin

First Google Ganeti Conference: GanetiCon 2013

Synnefo has announced the first Google Ganeti Conference: GanetiCon 2013.  They will be co-organizers.  The announcement was first made on the Synnefo blog.

The conference will take place between 3-5 September 2013 in Athens, Greece. The venue and program will be announced soon. Most developers of the Ganeti and Synnefo team will be attending.

The first GanetiCon will be a developer oriented conference. Sessions will be a mix of design talks and discussions about new features and future plans. It will also probably feature an advanced Ganeti workshop, depending on user demand.

The conference is geared towards people interested in:

  • learning how other companies/institutions use Ganeti
  • checking out what large-scale Ganeti deployments look like
  • glimpsing the product roadmap of Ganeti
  • contributing to future design of Ganeti
  • obtaining help with specific Ganeti issues

The organizers do not yet have a website. To be kept informed as information becomes available, please fill out this form.

I wish them the best of luck!  It sounds like a great conference!

May 23, 2013 03:00 PM

Yellow Bricks

Number of vSphere HA heartbeat datastores less than 2 error, while having more?


Last week on twitter someone mentioned that he had received the error stating he had fewer than two vSphere HA heartbeat datastores configured. I wrote an article about this error a while back, so I asked him if he had two or more. This was the case, so the next thing to do was to “reconfigure for HA” and hopefully clear the message.

The number of vSphere HA heartbeat datastores for this host is 1 which is less than required 2

Unfortunately, after reconfiguring for HA the error was still there. The next suggestion was to look at the “heartbeat datastore” section in HA. For whatever reason HA was configured to “Select only from my preferred datastores” and no datastores were selected, just like in the screenshot below. HA does not override this, so when configured like this NO heartbeat datastores are used, resulting in this error within vCenter. Luckily the fix is easy: just set it to “Select any of the cluster datastores”.

the number of heartbeat datastores for host is 1


"Number of vSphere HA heartbeat datastores less than 2 error, while having more?" originally appeared on Yellow-Bricks.com. Follow me on twitter - @DuncanYB.

by Duncan Epping at May 23, 2013 10:58 AM

Debian Admin

How to Install Cinnamon 1.8 on Debian 7.0 (Wheezy)

Cinnamon is a user interface. It is a fork of GNOME Shell, initially developed by (and for) Linux Mint. It attempts to provide a more traditional user environment based on the desktop metaphor, like GNOME 2. From Cinnamon 1.2 onwards it uses Muffin, a fork of the GNOME 3 window manager Mutter, as its window manager.

What is new in version 1.8

File Manager

Nemo received a lot of attention. Its user interface was heavily modified and its behavior was adapted to integrate better with Cinnamon. You can now easily hide the sidebar and switch back and forth between places and treeview. Under each place, if applicable, a small bar indicates how much space is used.

Screensaver

Cinnamon now features its own screensaver. One of its particularities is that you can define an away message before locking up your screen.

Control Center

All configuration modules are now present in Cinnamon Settings. You no longer need to use Gnome Control Center.

Desklets

KDE calls them Plasmoids, Android calls them Widgets, in Cinnamon they’re called “Desklets”. The same way you can add applets to your panel, you can add desklets to your desktop.

Spices Management

In Cinnamon 1.8 you can install “spices” (i.e. applets, desklets, themes, extensions) straight from your desktop.

New features for developers

Settings API for Applets/Desklets

If you’re an Applet/Desklet developer, don’t use gsettings anymore. Cinnamon 1.8 features a settings API which will do all the work for you.

It will set up your settings and default values for you, automatically.
It will allow you to access your settings just as easily as you access values in an array.
It will generate a configuration screen for you, automatically.

Install Cinnamon 1.8 on Debian 7

Open the terminal and run the following commands

$ su -
# echo "deb http://packages.linuxmint.com/ debian main import backport upstream romeo" >> /etc/apt/sources.list
# apt-get update
# apt-get install linuxmint-keyring
# apt-get update
# apt-get install cinnamon


by ruchi at May 23, 2013 10:18 AM

Google Blog

Capturing the beauty and wonder of the Galapagos on Google Maps

The Galapagos Islands are some of the most biologically unique ecosystems in the world. Explorers and scientists alike have long studied and marveled at these islands—made famous by Charles Darwin. The Ecuadorean Government, local conservation groups and scientists are working to protect the Galapagos from threats posed by invasive species, climate change and other human impacts.

It’s critical that we share images with the world of this place in order to continue to study and preserve the islands’ unique biodiversity. Today we’re honored to announce, in partnership with Charles Darwin Foundation (CDF) and the Galapagos National Parks Directorate (GNPD), that we’ve collected panoramic imagery of the islands with the Street View Trekker. These stunning images will be available on Google Maps later this year so people around the world can experience this remote archipelago.

Daniel Orellana of Charles Darwin Foundation crossing a field of ferns to reach Minas de Azufre (naturally-occurring sulfur mines) on the top of Sierra Negra, an active volcano on Isabela Island. The Google Maps team traveled for more than three hours, hiking and on horseback, to reach this remote location.

Images, like the one you see above, are also an important visual record that the CDF and GNPD will use to study and protect the islands by showing the world how these delicate environments have changed over time.

Daniel Orellana of the Charles Darwin Foundation climbs out of a lava tunnel where he was collecting imagery. The dramatic lava landscapes found on Isabela island tell the story of the formation of the Galapagos Islands.

Our 10-day adventure in the Galapagos was full of hiking, boating and diving around the islands (in hot and humid conditions) to capture 360-degree images of the unique wildlife and geological features of the islands with the Trekker. We captured imagery from 10 locations that were hand-selected by CDF and GNPD. We walked past giant tortoises and blue-footed boobies, navigated through steep trails and lava fields, and picked our way down the crater of an active volcano called Sierra Negra.


A Galapagos giant tortoise crawls along the path near Googler Karin Tuxen-Bettman while she collects imagery with the Street View Trekker in Galapaguera, a tortoise breeding center, which is managed by the Galapagos National Park Service.

Life underwater in the Galapagos is just as diverse as life on land. We knew our map of the islands wouldn’t be comprehensive without exploring the ocean that surrounds them. So for the second time we teamed up with the folks at the Catlin Seaview Survey to collect underwater panoramic imagery of areas being studied by CDF and GNPD. This imagery will be used by Catlin Seaview Survey to create a visual and scientific baseline record of the marine environment surrounding the islands, allowing for any future changes to be measured and evaluated by scientists around the world.

Christophe Bailhache navigates the SVII camera through a large group of Sea Lions at Champion Island in Galapagos. Image courtesy of the Catlin Seaview Survey.

We truly believe that in order to protect these Galapagos Islands, we must understand them. As they say, “a picture is worth a thousand words.” We hope this Street View imagery not only advances the important scientific research, but also inspires you to learn more about this special place. Stay tuned for updates on this collection—the first time we’ve captured imagery from both land and sea! We can’t wait to share this amazing imagery with you later this year.

by Emily Wood (noreply@blogger.com) at May 23, 2013 10:00 AM


Chris Siebenmann

Why web robots sending Referer headers is wrong

I've written before on my view that web robots of all sorts should never send a Referer header. In those entries I mostly said 'don't do that' without giving a solid philosophical argument about why, so today I feel like changing that.

(Not that a philosophical argument actually matters. Proper behavior on the web is defined by social convention, i.e. by what lots of other people do and expect, not by arguing with people over what makes sense. Whether or not you agree with a social convention you break it at your peril, and today robots not sending Referer headers is a well established social convention that I will ban you for violating. And anyways the people who should read this never will.)

There are two philosophical reasons why it's wrong for robots to send Referer headers. The first is inherent in what the Referer header means, namely 'I just followed a link from page <X>'. This is a description of human behavior but not really of robot behavior; almost no web robot actually traverses the web in that way, finding links and immediately following them. If you crawl web pages, accumulate links, and then some time later crawl those links, you are not 'following a link' in any conventional sense. Worse, what happens if you discover the same link through multiple source documents? Which document gets 'credit' and appears in Referer?

(Yes, yes, this is not quite the spec definition, which kind of permits the 'I found it here' meaning that robots sometimes use. It is instead the practical definition of the header, as defined by how most everything behaves.)

So, you say, you don't care; you want to use Referer as a kind of 'this is what links to you' field for servers. I can summarize a bunch of problems here by saying that the Referer field is a terrible way to communicate this information to web operators, fundamentally because you are trying to use a side effect of HTTP requests to pass on what may be a huge amount of information. If you actually want to be useful you should make this information available on your own web site where people can see and fetch it in bulk.

Finally, the brutal truth is that 'who links to me' is by far less interesting than 'who is sending human traffic to me (right now)'. By far the most valuable part of Referer is information on where real (human) visitors are coming from, to the extent that it's possible to find this out. Being read by people is the ultimate purpose of most web pages, which makes what places are the source of traffic and active links something of decided interest to us. And this sort of human behavior has very little to do with either robot behavior or what potential links exist out there in the world. Mingling either your robot's actions or a 'helpful' attempt to tell us about the latter is not doing us any favours; rather the contrary, in fact (this is one large reason that I react angrily to robots sending Referer).

(There is also the inconvenient fact that once you're operating a decent sized site you're not likely to really care about who links to you because there will be far too many links out there, most of them in increasingly obscure and unimportant places. The links you do care about are exactly the links that send you significant traffic.)

by cks at May 23, 2013 04:26 AM

May 22, 2013


Yellow Bricks

Is flash the saviour of Software Defined Storage?


I have this search column open on twitter with the term “software defined storage”. One thing that kept popping up in the last couple of days was a tweet from various IBM people around how SDS will change flash. Or let me quote the tweet:

What does software-defined storage mean for the future of #flash?

It is part of a twitter chat scheduled for today, initiated by IBM. It might be just me misreading the tweets, or the IBM folks look at SDS and flash in a completely different way than I do. Yes, SDS is a nice buzzword these days. I guess with the billion dollar investment in flash IBM has announced, they are going all-in with regards to marketing. If you ask me they should have flipped it and the tweet should have stated: “What does flash mean for the future of Software Defined Storage?” Or, to make it sound even more like marketing: is flash the saviour of Software Defined Storage?

Flash is a disruptive technology, and it is changing the way we architect our datacenters. Not only has it already allowed many storage vendors to introduce additional tiers of storage, it has also allowed them to add an additional layer of caching in their storage devices. Some vendors have even created all-flash storage systems offering thousands of IOps (some will claim millions); performance issues are a thing of the past with those devices. On top of that, host-local flash is the enabler of scale-out virtual storage appliances. Without flash those types of solutions would not be possible, or at least not with decent performance.

For a couple of years now, host-side flash has also been becoming more common, especially since several companies jumped into the huge gap that existed and started offering caching solutions for virtualized infrastructures. These solutions allow companies who cannot move to hybrid or all-flash solutions to increase the performance of their virtual infrastructure without changing their storage platform. Basically what these solutions do is make a distinction between “data at rest” and “data in motion”. Data in motion should reside in cache, if configured properly, and data at rest should reside on your array. These solutions once again will change the way we architect our datacenters. They provide a significant performance increase, removing many of the performance constraints linked to traditional storage systems; your storage system can once again focus on what it is good at… storing data / capacity / resiliency.

I think I have answered the question, but for those who have difficulty reading between the lines: how does flash change the future of software defined storage? Flash is the enabler of many new storage devices and solutions, be it a virtual storage appliance in a converged stack, an all-flash array, or host-side IO accelerators. Through flash new opportunities arise, including new options for virtualizing existing (I/O intensive) workloads. With it, many new storage solutions were developed from the ground up: storage solutions that run on standard x86 hardware, storage solutions with tight integration with the various platforms, and solutions which offer things like end-to-end QoS capabilities and a multitude of data services. These solutions can change your datacenter strategy and be a part of your software defined storage strategy, helping you take that next step forward in optimizing your operational efficiency.

Although flash is not a must for a software defined storage strategy, I would say that it is here to stay and that it is a driving force behind many software defined storage solutions!

"Is flash the saviour of Software Defined Storage?" originally appeared on Yellow-Bricks.com. Follow me on twitter - @DuncanYB.

by Duncan Epping at May 22, 2013 03:01 PM

Simplehelp

How to Stream Spotify from Your iPhone to Your AppleTV or Boxee


This very brief tutorial will show you how to stream from the Spotify App on your iPhone or iPad to your AppleTV or Boxee Box/PC.

The ability to stream audio from Spotify on your iPhone or iPad to your AppleTV or Boxee device is actually built into the Spotify App – it’s just very buried. Here’s how to find it –

  1. Launch Spotify and start playing a song. Tap the ‘box’ with the Artist/Song information located at the bottom of the Spotify App.
  2. Now tap anywhere on the Album cover/art.
  3. Here you’ll finally find the Apple “Send To” button (as illustrated in the screenshot below). Tap it…
  4. … and select Apple TV
  5. … or the name of your Boxee device.
  6. Now you can listen to Spotify on your home media center.

by Ross McKillop at May 22, 2013 11:45 AM

Google Blog

“Coming Home” by Wisconsin student wins U.S. 2013 Doodle 4 Google competition

After 130,000 submissions and millions of votes cast, Sabrina Brady of Sparta, Wisc. has been named the 2013 U.S. Doodle 4 Google National Winner. Her doodle, “Coming Home,” will be featured on the Google homepage in the U.S. tomorrow, May 23.

Students across all 50 states amazed us with their creative interpretations of this year’s theme, “My Best Day Ever...” From scuba diving to dinosaurs to exploring outer space, we were wowed by the ways young artists brought their best days to life in their doodles.

Sabrina’s doodle stood out in the crowd; it tells the story of her reunion with her father as he returned from an 18 month deployment in Iraq. Her creative use of the Google letters to illustrate this heartfelt moment clearly resonated with voters across the country and all of us at Google.

In addition to seeing her artwork on the Google homepage, Sabrina—who is in 12th grade at Sparta High School—will receive a $30,000 college scholarship, a Chromebook computer and a $50,000 technology grant for her school. She will attend Minneapolis College of Art and Design this coming fall, where she will continue her artistic pursuits. Congratulations Sabrina!


In addition to the National Winner, voters across the country helped us determine the four National Finalists, who will each receive a $5,000 college scholarship:
  • Grades K-3: Reagan Gonsalves (Grade 1, Santan Elementary School, Chandler, Ariz.) for her doodle “My best day ever is learning about nature.” Reagan says, “My best day ever is to be around the pretty animals and plants in nature, because I love to know about what is around me. I love to watch hummingbirds drink nectar out of flowers. I love to read books on nature and how plants and animals grow.”
  • Grades 4-5: Audrey Zhang (Grade 4, Michael F. Stokes Elementary School, Levittown, N.Y.) for her doodle “...When I discover paradise!” Zhang says, “My best day ever will be when I discover paradise. In paradise, I could play with dragons, romp with leopards, and chat with fairies...It would be the best day ever when I could finally live in a mystical, dreamy realm.”
  • Grades 6-7: Maria Iannone (Grade 7, Chestnut Ridge Middle School, Sewell, N.J.) for her doodle “The best day ever.” Maria says, “Where I live, it's difficult to view the night sky very well. Having an interest in astronomy, a day where I can observe the things I study on my own time would satisfy me.”
  • Grades 8-9: Joseph Han (Grade 8, Falmouth Middle School, Falmouth, Maine) for his doodle “Late-afternoon bliss.” Joey says, “For me, ‘the best day ever’ doesn't consist of ambitious dreams, but rather the enjoyment of a day spent in carefree euphoria. Being in the woods is something that evokes such happiness in me. The lighthearted joy of rafting, fishing or catching fireflies is what I've attempted to capture.”

After the awards ceremony, all 50 of our State Winners will unveil a special exhibition of their artwork at the American Museum of Natural History in New York City, where their doodles will be displayed for the public to view from May 22 - July 14.

Thanks to all who voted and helped us select the 2013 Doodle 4 Google winners. Even more importantly, thank you to all of the students who submitted their artwork and the parents and teachers who continue to inspire and support their young artists. Until next year... happy doodling!

by Emily Wood (noreply@blogger.com) at May 22, 2013 10:32 AM

Top Charts in Google Trends—The most searched people, places and things

Ever wonder what the world is searching for? With Google Trends, you can see what's hot right now, and also explore the history and geography of a topic as it evolves. Today you'll find new charts of the most-searched people, places and things in more than 40 categories, from movies to sports teams to tourist attractions. You'll also find a new colorful visualization of real-time Hot Searches.

Top Charts—a new monthly "spirit of the times"
Top Charts are lists of real-world people, places and things ranked by search interest. They show information similar to our Year-End Zeitgeist, but updated monthly and going back to 2004. To check them out, go to Google Trends and click "Top Charts" on the left-hand side. For example, you can see the 10 most-searched cities, movies and scientists in April:

Top Charts includes more than 40 top 10 lists and more than 140 time periods. Hover on a chart for links to embed the chart in your own page or share on social media.

Top Charts is built on the Knowledge Graph, so the data shows interest in real-world things, not just keywords. When you look at a chart of sports teams and you see the Golden State Warriors, those rankings are based on many different related searches, like [gs warriors], [golden state bball] and [warriors basketball]. That way you see which topics are most popular on Google Search, however people search for them. Top Charts provide our most accurate search volume rankings, but no algorithm is perfect, so on rare occasion you may find anomalies in the data. You can learn more about Top Charts in our Help Center.

Hot Searches, now in hot colors
In addition to Top Charts, now there's a vibrant new way to visualize trending searches as they happen. On the Trends homepage in the left-hand panel, you'll find a new link to "Visualize Hot Searches in full-screen." You’ll see the latest trending topics appear in a colorful display:


You can customize the layout by clicking the icon in the upper-left corner and expanding it to see as many as 25 searches at a time. You can also pick any region currently supported by Hot Searches. Use fullscreen mode in your browser for the biggest, purest eye candy.

...and a few design updates
We’re also continuing to spruce up our site. Among other things, now the homepage shows you more interesting stuff up front, and the search box is always available at the top:

The new Trends homepage shows a list of today's Hot Searches. Enter search terms at the top to see search interest over time and by geography.

We hope you enjoy bringing new stories to life with Google Trends. We love feedback, so please feel free to let us know what you think by posting online or by clicking "Send Feedback" at the bottom of any page in Google Trends.

by Emily Wood (noreply@blogger.com) at May 22, 2013 09:02 AM

Chris Siebenmann

Diffbot's bad Referer header

Today a web spider called 'Diffbot' (run by diffbot.com) made a whole bunch of requests here, all of which failed. They failed because, just as it has repeatedly done in the past, it made them all with a Referer header of 'http://news.google.com/' and this behavior long ago led me to ban it entirely from here.

There are a number of things wrong with this header. The first is that, to steal from the old Trix commercials, 'silly robot, the Referer header is for humans'. I've written about this before at some length and doing it here is generally a good way to get your spider banned.

(I have a philosophical ramble about why this is the correct view, but it's going in another entry.)

The second is that, of course, this Referer value is a flaming lie in two different ways. Diffbot in no way shape or form traveled from news.google.com to the whole collection of URLs here that it attempted to crawl with that Referer header and on top of that, news.google.com does not link to here at all. Diffbot made up the header from whole cloth. I react very badly to web spiders that lie to me at the best of times (even if they aren't spraying junk over my referer logs).

Diffbot and its operators may or may not be legitimate, or at least honest about what they're doing; I have no particular opinions on that. But they are unquestionably operating a web spider that routinely lies. I have no idea why and really, I don't care; I was doing them a favour by letting them crawl me and I can and will withdraw that favour if they irritate me.

(See also my technical requirements for web spiders and my standards for responsible spider behavior.)

(No, I haven't mailed Diffbot's operators about this behavior. Are you kidding? I'm neither crazy nor stupid. On today's Internet, mailing people about issues is for people that you actually trust.)

by cks at May 22, 2013 03:21 AM

May 21, 2013

Everything Sysadmin

TodoPro available for Android (beta)

The todo list program that I use on my iPhone is now available on Android. It is a beta. I've been using the earlier betas on my Android tablet and it is looking very good.

Previously I hadn't found todo list software for Android that worked well for me, and I had tried many. I'd been doing all my time management on my iPhone because TodoPro worked so well for me. I'm very excited that an Android release is now available.

I don't endorse products but I do let people know what I personally use. I think todo list software is very personal... the definition of "best" is "what works for you". So, try it and see if it works for you.

Visit TodoPro for Android on the Google Play Store.

TodoPro offers a sync service. When you have syncing set up you can access the same data from all your devices as well as the web-based interface.

Full details are available on the company's website.

May 21, 2013 07:28 PM

my other pc is a cloud

My Entry for the Advanced Event #4 of the 2013 Scripting Games

We're on the downhill stretch now. Honestly I'm kind of glad.  These scripts are fun to write, and great practice, but it's work.  I can tell that I'm not the only one losing steam, as the number of votes on other people's entries has gone way down.  Anyway, about the script I wrote: I like that the #Requires -Modules statement at the top automatically loads the AD module for you if it's not already loaded. I still didn't do the BEGIN/PROCESS/END blocks this time either, which I fail to see how it matters at all, since I'm not dealing with pipeline input... but I'm sure I'll still get crowd scores of 1 and 2 stars for it.  That and dudes with 640x480 monitors going "some of your code goes off the screen why don't you splat!?"  :P

#Requires -Version 3
#Requires -Modules ActiveDirectory
Function Get-RandomADUser
{
<#
.SYNOPSIS
    Retrieves random users from Active Directory and generates an HTML report.
.DESCRIPTION
    Retrieves random users from Active Directory, generates an HTML report,
    and then returns the users to the pipeline. 
    Use the -Verbose switch if you want to see console output.
    This Cmdlet requires PS 3 and the Active Directory module. The AD module
    will be loaded automatically if it isn't already.
.PARAMETER Count
    The number of random users to get from Active Directory. Minimum is 1,
    maximum is Int16.MaxValue (32767) and the default is 20.
.PARAMETER Filename
    The filename to write the HTML report to. The filename must end in
    html or htm. The default is .\RandomADUsers.html.
.EXAMPLE
    Get-RandomADUser
 
    Gets 20 random users from AD, outputs a report to .\RandomADUsers.html.
.EXAMPLE
    Get-RandomADUser -Count 100 -Filename C:\reports\rpt.html
 
    Gets 100 random users from AD, outputs a report to C:\reports\rpt.html.
#>
 
    [CmdletBinding()]
    Param([Parameter()]
            [ValidateRange(1, [Int16]::MaxValue)]
            [Int16]$Count = 20,
          [Parameter()]
            [ValidateScript({($_.ToLower().Split('.')[-1] -EQ "html" -OR $_.ToLower().Split('.')[-1] -EQ "htm") -AND (Test-Path -IsValid $_)})]
            [String]$Filename = ".\RandomADUsers.html") 
 
    Try
    {
        Write-Verbose "Retrieving users from Active Directory..."
        $Users = Get-ADUser -Filter * -Properties Department, Title, LastLogonDate, PasswordLastSet, Enabled, LockedOut -ErrorAction Stop | Get-Random -Count $Count
        Write-Verbose "$($Users.Count) users retrieved from Active Directory."
    }
    Catch
    {
        Write-Error "Unable to retrieve users from Active Directory: $($_.Exception.Message)"
        Return
    }   
    Try
    {
        Write-Verbose "Generating report $Filename..."
        $Header = @'
        <title>Random Active Directory User Audit</title>
            <style type="text/css">
                <!--
                    TABLE { border-width: 1px; border-style: solid;  border-color: black; }
                    TD    { border-width: 1px; border-style: dotted; border-color: black; }
                -->
            </style>
'@
        $Pre  = "<p><h2>Random Active Directory User Audit for $Env:USERDNSDOMAIN</h2></p>"
        $Post = "<hr><p style=`"font-size: 10px; font-style: italic;`">This report was generated on $(Get-Date)</p>"
        $Users | ConvertTo-HTML -Property SamAccountName, Department, Title, LastLogonDate, PasswordLastSet, Enabled, LockedOut -Head $Header -PreContent $Pre -PostContent $Post | Out-File $Filename     
        Return $Users
    }
    Catch
    {
        Write-Error "Unable to generate report: $($_.Exception.Message)"
    }
}

by ryan@myotherpcisacloud.com at May 21, 2013 05:35 PM

Milek

Setting RPATH

Today I was made aware that the elfedit tool in Solaris 11 allows for setting RPATH (among other things). The only caveat is that the binary has to have been linked on Solaris 11. It is very easy to use:

 # elfedit -e 'dyn:runpath $ORIGIN/../lib' /opt/bin/myprog 

There is a nice blog entry about it from Ali Bahrami.
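To make the effect of the '$ORIGIN/../lib' value concrete, here is a minimal Python sketch of how such a runpath entry expands for the binary in the command above. It is only an illustration of the path arithmetic; the function name expand_origin is made up for this example, and the real runtime linker works from the resolved location of the binary rather than from a plain string.

```python
import posixpath


def expand_origin(runpath: str, binary_path: str) -> str:
    """Expand $ORIGIN in a runpath entry to the directory holding the binary.

    Illustration only: this is plain string and path arithmetic, not what
    the runtime linker actually executes.
    """
    origin = posixpath.dirname(binary_path)
    return posixpath.normpath(runpath.replace("$ORIGIN", origin))


# The binary and runpath from the elfedit command above.
print(expand_origin("$ORIGIN/../lib", "/opt/bin/myprog"))  # /opt/lib
```

The point of $ORIGIN is that the library directory is found relative to wherever the program is installed, so the whole tree can be relocated without editing the binary again.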

by milek (noreply@blogger.com) at May 21, 2013 02:41 PM

Google Blog

Congratulations to the 2013 Google Anita Borg Memorial Scholars

Dr. Anita Borg revolutionized the way we think about technology and worked to dismantle the barriers that keep women and minorities from entering the computing and technology fields. In her lifetime, Anita founded the Institute for Women and Technology (now The Anita Borg Institute for Women and Technology), began an online community called Systers for technical women, and co-founded the Grace Hopper Celebration of Women in Computing. We’re proud to honor her memory through the Google Anita Borg Memorial Scholarship, established in 2004.

Today we’d like to recognize and congratulate the 30 Google Anita Borg Memorial scholars and the 30 Google Anita Borg Memorial finalists for 2013. The scholars, who attend universities in the United States and Canada, will join the annual Google Scholars’ Retreat this summer in New York City, where they will have the opportunity to attend tech talks on Google products, network with other scholars and Googlers, participate in developmental activities and sessions, and attend social activities. This year, the scholars will also have the opportunity to participate in a scholars’ edition of 24HoursOfGood, a hackathon in partnership with local non-profit organizations who work on education and STEM initiatives to make progress against a technical problem that is critical to their organization’s success.

Find out more (PDF) about our winners, including the institutions they attend. Soon we’ll select the Anita Borg scholars from our programs around the world. For more information on all our scholarships, visit the Google Scholarships site.

by Emily Wood (noreply@blogger.com) at May 21, 2013 01:00 PM

Managing Product Development

Devs in the ‘Ditch Slides Posted

I gave a talk at Devs in the ‘Ditch last week when I was in London. I posted the slides on slideshare: Overcoming Three Pitfalls of Transitioning to Agile.

The very nice people at 7digital made a video and posted it, too. If you can take the time, watch the entire video. Rob Bowyer gave a great talk about kanban and theory of constraints. My part about overcoming these three pitfalls starts at about 42 minutes in.

There are many other pitfalls to transition. This talk had just three of them: the stories are too big, you need experts to do the work, and you implement as layers instead of through the architecture.

I hope you enjoy the presentation and the video.

 

by Johanna Rothman at May 21, 2013 05:29 AM

Chris Siebenmann

A serious potential danger with Exim host lists in ACLs

Suppose that you have an Exim installation and you want to support some sort of source host based blocking (selective or otherwise) of incoming connections. The obvious way is to create an ACL section that looks something like this:

deny
    domains = +local_domains
    hosts   = ${if exists {UBLOCKDIR/hosts} {UBLOCKDIR/hosts}}
    message = mail from host $sender_host_address not accepted by <$local_part@$domain>.
    log_message = blocked by personal hosts blacklist.

(This one is a selective, per destination address host block list, hence the fun and games with UBLOCKDIR.)

This looks great and generally works but you've just armed a ticking time bomb, one that can blow your incoming email up with permanent temporary deferrals. The first problem is that Exim has no way in a host list to say 'this domain and any of its subdomains', in the way that the TCP wrappers '.host.com' will match both 'host.com' and 'fred.host.com'. If you want to match this case, the obvious way is to write two entries:

*.host.com
host.com

The first matches any subdomains of the domain; the second matches the domain itself. But you've just put the fuse in the bomb, because of just how plain host and domain names work in host lists. From the Exim specification with the emphasis being mine:

  • If the pattern is a plain domain name [...] Exim calls the operating system function to find the associated IP address(es). [...]

    If there is a temporary problem (such as a DNS timeout) with the host name lookup, a temporary error occurs. For example, if the list is being used in an ACL condition, the ACL gives a "defer" response, usually leading to a temporary SMTP error code.

So here's what happens. You list '*.spammer.com' and 'spammer.com' in your blocklist. Spammer.com turns off their DNS (or their DNS server turns it off because hey, they're a spammer) but doesn't de-register their domain, so DNS queries to their nominal authoritative DNS servers either don't get answers or get non-authoritative 'look elsewhere' results. Although this is a permanent condition, it's considered a temporary failure in DNS resolution. Exim now defers all SMTP connections that consult this host blocklist, regardless of where they are from. For ever, or at least until you notice.

Now that I've read the Exim documentation in detail, it spells out that you can turn this behavior off with the special option +ignore_defer. You probably want to do this. Certainly we do.

My feeling is that you want to do this for every host list anywhere except ones used for real, strong access control (which probably don't want to be using DNS names anyways). Consider, for example, a host list used for exceptions to greylisting; you probably don't want that ACL to defer the connection if you can't resolve a domain in it.
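The failure mode is easier to see in a toy model. The Python sketch below is not Exim's code and only handles plain host names; the function names, the exception class and the fake resolver are all invented for illustration. It assumes, as described above, that a temporary lookup failure on a list item defers the whole check unless +ignore_defer appears earlier in the list.

```python
class TemporaryLookupError(Exception):
    """Stands in for a DNS timeout or other temporary resolution failure."""


def match_host_list(items, client_ip, resolve):
    """Toy model of a host list check: returns 'match', 'no match' or 'defer'."""
    ignore_defer = False
    for item in items:
        if item == "+ignore_defer":
            ignore_defer = True
            continue
        try:
            addresses = resolve(item)
        except TemporaryLookupError:
            if ignore_defer:
                continue  # treat the unresolvable item as a non-match
            return "defer"  # one broken domain defers every connection
        if client_ip in addresses:
            return "match"
    return "no match"


def fake_resolve(name):
    """Hypothetical resolver: spammer.com's DNS is broken, other names resolve."""
    if name == "spammer.com":
        raise TemporaryLookupError(name)
    return {"goodhost.example": ["192.0.2.10"]}.get(name, [])


plain = ["spammer.com", "goodhost.example"]
guarded = ["+ignore_defer"] + plain

# A connection from a completely unrelated address:
print(match_host_list(plain, "198.51.100.7", fake_resolve))    # defer
print(match_host_list(guarded, "198.51.100.7", fake_resolve))  # no match
```

Note that the deferred client has nothing to do with spammer.com; the lookup failure alone is enough to stall it.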

Sidebar: the other surprise in Exim host lists

Suppose that you have a host list like this:

*.spammer.com
192.168.0.0/16

Surprise: any connection from a host in 192.168/16 that does not have valid reverse DNS will not match the list. The moment you list a hostname wildcard in a host list, any IP address without a hostname automatically fails to match that entry or anything later in the list (or file if the list is in a file). It will match IP address patterns that are earlier in the list, though, so you get to remember to list all IP address patterns first. This behavior is documented if you read the documentation carefully.

Per the fine documentation this behavior can be turned off with +ignore_unknown. Now that I've found this, I need to make some configuration changes.

This is generally less dangerous than the host list defer time bomb, but it depends on what you're using the host list for. If you have a locked down configuration where you're using the host list for strong access control, well, you have potential issues here.
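The ordering effect can be sketched the same way. This is again a simplified Python model rather than Exim's implementation: the function name is made up, wildcard matching is approximated with fnmatch, and a client without reverse DNS is represented by a name of None.

```python
import ipaddress
from fnmatch import fnmatchcase


def host_list_matches(items, client_ip, client_name):
    """Toy model of host list matching; client_name is None without reverse DNS."""
    ignore_unknown = False
    for item in items:
        if item == "+ignore_unknown":
            ignore_unknown = True
            continue
        try:
            network = ipaddress.ip_network(item, strict=False)
        except ValueError:
            network = None  # not an IP pattern, so treat it as a host name pattern
        if network is not None:
            if ipaddress.ip_address(client_ip) in network:
                return True
            continue
        if client_name is None:
            if ignore_unknown:
                continue  # skip this item and keep going
            return False  # nothing at or after this item can match
        if fnmatchcase(client_name.lower(), item.lower()):
            return True
    return False


ip_no_rdns = ("192.168.1.5", None)
print(host_list_matches(["*.spammer.com", "192.168.0.0/16"], *ip_no_rdns))  # False
print(host_list_matches(["192.168.0.0/16", "*.spammer.com"], *ip_no_rdns))  # True
print(host_list_matches(["+ignore_unknown", "*.spammer.com", "192.168.0.0/16"],
                        *ip_no_rdns))                                       # True
```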

by cks at May 21, 2013 03:44 AM

Google Blog

Mario Testino to "The Scream" via Mark Rothko

Every day on the Art Project Google+ page we post a snippet of information about a painting, an artist or a talk—and every day, at least one of our 4 million followers has something to say in response. We’re constantly delighted by how the appetite for art online is growing and today we have a veritable feast in store with a swathe of fresh artworks, gigapixel paintings and museums on Street View.

New artworks from the famous to the unusual
Mario Testino is a world-famous photographer, known for his work in the fashion industry. Fewer people are aware of his photographs focusing on the culture of his native Peru. A new body of photographs called “Alta Moda” (high fashion), featuring Andean people in traditional and festive dress, is currently on display in Testino’s cultural institution, MATE. And for those of you not lucky enough to visit Lima, you can now see this collection of 27 photos online on the Google Art Project.


In total, we have more than 1,500 new high-resolution artworks including masterpieces such as Monet’s “Waterlilies,” Rembrandt’s “Portrait of a Man in a Broad-Brimmed Hat” and Johannes Vermeer’s “The Geographer” (meaning Art Project now houses 15 of his 34 total works, all contributed by different museums). However, the diversity goes well beyond paintings; from ancestral relics used to worship the dead to an ancient Jinsha gold mask from China thought to have been worn by sorcerers. Often the old contrasts with the new, with inscribed Arabic gemstones existing alongside contemporary glass structures from Germany as you can see in this “Compare” image below.


Zoom in to “gigapixel” paintings
Gigapixel paintings—very high-resolution works which enable you to zoom in at brushstroke level—have long been at the heart of the Art Project. They’re a great example of the magic that can happen when technology meets art—and today we have 16 new ones to add, ranging from famous pieces like “The Scream” by Edvard Munch to those chosen by public vote such as “Whitewashing the Old House” by L.A. Ring.

The beauty of gigapixels is their ability to surprise. Look at the painting “Fra Stalheim” by Johan Christian Dahl, shown in full on the left below. You’ll see a beautiful landscape. Zoom in, however, and you discover scenes within a scene—a village with smoking chimneys, a woman tending to her child, and cows grazing on the hillside. Details that can’t always be fully appreciated by the naked eye are brought to life online.


Immerse yourself in Street View
Through Street View and the Google Art Project, many museums have opened their galleries to the world the past few years, and today we’re launching 20 more. For example, Fondation Beyeler Museum in Switzerland houses a collection of seven Mark Rothko paintings. Now anyone in the world can virtually explore the collection.


Of course art collections are not exclusively found in museums—we’re delighted to have our first monastery on Street View in the Art Project. The Monastery of St. John the Theologian on the Greek island of Patmos was founded in 1088 and is a World Heritage Site. In addition to their 116 contributed artworks, you can also explore the architectural splendors of this ancient building.


Jump inside a whole range of beautiful buildings and corridors here by clicking on the orange pegman where it appears.

In a week that celebrates International Museum Day, we’re glad to be able to showcase some of the great treasures held by museums and cultural institutions the world over. There are so many benefits to bringing more content online, be it discovering a new style of art or artist, creating your own gallery, stumbling across a hidden detail of a painting you thought you knew or simply being inspired by something beautiful. With more than 40,000 total works and 250+ cultural organisations around the globe, we hope the experience will be more enriching than ever.

by Emily Wood (noreply@blogger.com) at May 21, 2013 02:00 AM

Linux Poison

eBook - A guide to programming Linux kernel modules

"The Linux Kernel Module Programming Guide"

A guide to programming Linux kernel modules.
An excellent guide for anyone wishing to get started on kernel module programming. The author takes a hands-on approach starting with writing a small "hello, world" program, and quickly moves from there.

Far from a boring text on programming, Linux Kernel Module Programming Guide has a lively style that entertains while it educates.

Download your free copy of "The Linux Kernel Module Programming Guide" -- here

by noreply@blogger.com (Nikesh Jauhari) at May 21, 2013 12:16 AM

May 20, 2013

Ubuntu Geek

Mixxx – The most advanced free DJ software

Mixxx is a DJ tool that allows for the playback and mixing of digital music (MP3, Ogg Vorbis, FLAC and Wave). The basic requirements for Mixxx are a desktop computer or laptop with a reasonable amount of storage space on the hard drive for your music, at least 1 audio card for outputting the sound and a way of controlling the software either by mouse, keyboard or hardware DJ Controller.
(...)
Read the rest of Mixxx – The most advanced free DJ software (188 words)


© ruchi for Ubuntu Geek, 2013.

by ruchi at May 20, 2013 11:19 PM

Trouble with tribbles

Sparse root zones in Tribblix

Zones was one of the pillars of Solaris 10 (the others being DTrace, SMF, and ZFS). Lightweight virtualization enabled deployment flexibility and significant consolidation.

The original implementation was heavily integrated with packaging. In many ways, it broke the packaging system. In OpenSolaris and Solaris 11, packaging was completely replaced and the zone implementation is very different, but it suffers from the same fundamental flaw: it's integrated at the heart of packaging.

Furthermore, sparse-root zones - where most of the operating system is shared between zones, with just configuration and transient files being unique to a zone - do not exist in the new world order, with each zone now being a separate OS instance. The downside to this, apart from requiring significantly more RAM and disk, is that you then have to manage many instances of the OS, rather than just the one.

In Tribblix, I have reimplemented sparse-root (and whole-root) zones, so that they look very similar to what you had in Solaris 10. The implementation is completely different, though, in that it expects zones to understand packaging rather than expecting packaging to understand zones.

Read here on how to create a sparse-root zone using Tribblix. What follows is some of the under-the-hood details of the implementation I've put together.

First, zone configurations are stored in /etc/zones. If you look on a system that supports zones you'll see a number of xml files in that directory. Some correspond to the zones configured on the system; others are templates. For a sparse-root zone in Solaris 10, there will be some inherited-pkg-dir entries. In the Tribblix implementation, these become simply loopback mounts, handled no differently than any other mount.

Then under /usr/lib/brand you will find a number of directories containing scripts to manage zones. Some of it is shared, some specific to a given brand. I've created a sparse-root and a whole-root brand, and created the scripts to build zones of the correct type.

The key script is called pkgcreatezone, which is the script called to actually populate an empty zone with the bits that will make it work. (It's not called that in Solaris 10 - there you'll find a binary that calls another binary from Live Upgrade to do the work. But in OpenSolaris and Tribblix it's just a script.)

For the ipkg brand, the pkgcreatezone script sets a bunch of IPS variables and creates an IPS image followed by a bit of cleanup. Really, it's nothing complicated.

For the sparse-root brand, you get the main /lib, /usr, /platform, and /sbin directories mounted from the global zone, so you can ignore those. Some standard directories you can simply create. And then all I do is cpio the /etc and /var directories into the zone's file system, and that's it. Well, not quite. I actually use the SVR4 contents file to provide the list of files and directories to copy, so that I don't start copying random junk and only have what's supposed to be there. And one advantage of SVR4 packaging here is that it saves a pristine copy of editable files, so I put that in the zone rather than the modified one. All in all, it takes a couple of seconds or so to install a zone on a physical system, which is far quicker than the traditional zone creation method.

I stumbled across an unfortunate gotcha while doing this. SMF manifests used to be in /var (which was always an odd place to put what are configuration files). They're now in /lib, which is again a very odd place to put configuration files. But this has the unfortunate consequence that, as /lib is loopback mounted into a zone, all the SMF manifests in the global zone will be imported, even though many of them are for services that aren't relevant to  a zone, and some of which flat out fail with errors. So what I had to do was create a clone of /lib, delete all the manifests that aren't relevant, and use that as the source for the zone (that's what the /zonelib directory is about, by the way).

When creating a whole-root zone, I simply cpio the /lib, /usr, /platform, and /sbin directories as well. (Cleaning up the SMF manifests as before.) So that takes a few minutes, but is a lot quicker than the old whole-root creation in Solaris 10.

Once I had the zone creation figured out, and the /lib shuffle sorted, the remaining problem was zone uninstall. I haven't changed anything for this, but I did need a bit of extra work in system installation.

# beadm list -H
tribblix;51f2d0f4-df6e-6e48-dc0a-a74f37e14930;NR;/;3387047936;static;1361968342


What you see here is the output from beadm list -H. That second field is a UUID that uniquely identifies a boot environment. This is a ZFS property, named org.opensolaris.libbe:uuid, that's set on the ZFS dataset that corresponds to the root filesystem of the specified BE. If you create a zone, its file systems are tagged with the property org.opensolaris.libbe:parentbe that has the same value. When you uninstall a zone, it finds all the file systems that belong to the zone, and checks that they correspond to the currently running boot environment by comparing the UUIDs. I hadn't set this, so nothing matched and uninstall wasn't removing the zone file systems. In the future, the Tribblix installer will set that property and everything that needs it just works.
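As a rough illustration of the comparison described above, the Python sketch below pulls the UUID out of the beadm list -H line shown earlier and checks it against a zone dataset's parentbe value. The function names are invented, only the first two fields are interpreted because those are the only ones explained here, and the real check is done by the zone tooling, not by anything like this.

```python
def parse_beadm_line(line):
    """Return the BE name and UUID from one line of 'beadm list -H' output."""
    fields = line.strip().split(";")
    return {"name": fields[0], "uuid": fields[1]}


def zone_belongs_to_be(parentbe, be_uuid):
    """True when a zone dataset's parentbe property matches the running BE."""
    return parentbe is not None and parentbe == be_uuid


line = "tribblix;51f2d0f4-df6e-6e48-dc0a-a74f37e14930;NR;/;3387047936;static;1361968342"
be = parse_beadm_line(line)

# A zone dataset tagged with the matching UUID is eligible for uninstall.
print(zone_belongs_to_be("51f2d0f4-df6e-6e48-dc0a-a74f37e14930", be["uuid"]))  # True
# With the property never set, nothing matches.
print(zone_belongs_to_be(None, be["uuid"]))  # False
```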

(As an aside, I ended up writing a quick and dirty script to generate the UUID, as Illumos doesn't actually have one. This is run in a minimalist install context, which I didn't want to bloat, so something that does a SHA1 digest of some data from /dev/random and mocks up the correct form does the trick nicely.)
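For the curious, the idea behind that script looks roughly like the Python sketch below. This is not the actual Tribblix script: the function name is made up, and os.urandom stands in for reading /dev/random. It hashes some random bytes with SHA1 and cuts the hex digest into the familiar 8-4-4-4-12 layout, which gives something UUID-shaped but does not set the version or variant bits of a real RFC 4122 UUID.

```python
import hashlib
import os
from typing import Optional


def mock_uuid(seed: Optional[bytes] = None) -> str:
    """Build a UUID-shaped string from the SHA1 digest of some random data."""
    data = seed if seed is not None else os.urandom(32)
    digest = hashlib.sha1(data).hexdigest()  # 40 hex characters
    h = digest[:32]                          # a UUID needs only 32 of them
    return "-".join([h[0:8], h[8:12], h[12:16], h[16:20], h[20:32]])


print(mock_uuid())
```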

So, the next release of Tribblix, the 0m6 prerelease, includes support for traditional whole-root and sparse-root zones. The point here isn't merely to replicate what's gone before, useful as that is. What this also shows is that, freed from the predefined constraints of a packaging system, you can generate completely arbitrary zone configurations, opening up a whole new array of possibilities.





by Peter Tribble (noreply@blogger.com) at May 20, 2013 09:27 PM

Sam Ruby

Prosody as a personal xmpp server

Nearly six years ago, I set up a personal Jabber server using ejabberd.  This setup survived the server migration to Ubuntu 8.04 and 10.04.  This past weekend, I attempted to migrate that to a server running 12.04 and all I could get out of it was an erlang crash dump.

A quick scan for successors turned up prosody. Configuration was as simple as adding a VirtualHost and setting allow_registration to true.

May 20, 2013 05:29 PM

my other pc is a cloud

Active Directory List Object Mode

This is something I've been wanting to blog about for a long time, but have been putting it off because I knew it might turn into a long, time-consuming post. Well it's time to bite the bullet and get started.

We were facing a bit of a problem in one of our managed hosting environments. We had this high-volume, multitenant Active Directory being used by dozens of different customers. There was a business requirement in this domain that customers not be able to read from one another's organization units for the sake of the mutual privacy of the customers. Things seemed to be working well for a while, but one day, it appeared that customer users logging on to many of the client computers were failing to process Group Policy upon logon:

Event ID: 1101
Source: Userenv
User: NT Authority\System
Description: Windows cannot access the object OU=Customers, DC=contoso, DC=com in Active Directory. The access to the object may be denied. Group Policy processing aborted.

To start troubleshooting, I copied one of the affected user accounts and used it to log in to one of their machines, and I was able to reproduce the issue. Upon trying to update Group Policy with gpupdate.exe, I noticed that the computer configuration was updating fine, while only the user portion of the update failed, and the event 1101 was produced.

The basic layout of the OU structure in the domain was this:

    
CONTOSO.COM
    |
    + Customers (OU)
          |
          + Customer1 (OU)
          |
          + Customer2 (OU)
          |
          + ...

Still using my customer-level user account, I noticed that I was able to browse the contents of my own Customer1 OU, but I was not able to browse the contents of any other OU. The permissions on these OUs had certainly been modified.

In fact, it was that the read permission for the Authenticated Users security group had been removed from the access control list on the Customers OU. That explains the event 1101s and the GPO processing failures. From Microsoft:

[GPO processing fails] when the Group Policy engine cannot read one of the OUs.

The Group Policy engine must be able to read all OUs from the level of the user object or the computer object to the level of the domain root object. Also, the Group Policy engine must be able to read the domain root object and the site object of the computer. This is because these objects may contain links to group policies. If the Group Policy engine cannot read one of these OUs, the events that are mentioned in the "Symptoms" section will be logged.

So in satisfying the business requirement that no customer be allowed to list the contents of another customer's OU, Group Policy processing had been broken. But if we simply gave Authenticated Users their read permissions back on the Customers OU, they would be able to browse all the other customers' OUs as well.

We needed the best of both worlds.
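The requirement quoted from Microsoft can be pictured as a walk from the user object up to the domain root. The Python sketch below is only a model of that walk: the user name jdoe and both function names are hypothetical, the DN is split naively on commas (so escaped commas are not handled), and the set of readable containers is made up to mirror the situation described here.

```python
def containers_to_root(user_dn):
    """List every container from the user's parent OU up to the domain root."""
    parts = [p.strip() for p in user_dn.split(",")]
    chain = []
    for i in range(1, len(parts)):
        chain.append(",".join(parts[i:]))
        if parts[i].upper().startswith("DC="):
            break  # reached the domain root object
    return chain


def first_unreadable(chain, readable):
    """Return the first container the user cannot read, or None if all are readable."""
    for dn in chain:
        if dn not in readable:
            return dn
    return None


chain = containers_to_root("CN=jdoe,OU=Customer1,OU=Customers,DC=contoso,DC=com")
readable = {"OU=Customer1,OU=Customers,DC=contoso,DC=com", "DC=contoso,DC=com"}

# The Customers OU is the gap in the chain, as in the event 1101 message.
print(first_unreadable(chain, readable))  # OU=Customers,DC=contoso,DC=com
```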

This Microsoft article would lead you to believe that if a security principal just had the Read gpLink and Read gpOptions access control entries, then GPO processing should work fine.

But that's not enough. The four ACEs that were needed on the Customers OU were:

  • Read gpLink
  • Read gpOptions
  • Read cn
  • Read distinguishedName

Now we're making progress, but we're still not out of the woods. Giving Authenticated Users the List Contents permission on the Customers OU would allow them to see the names of all the other customers' OUs, although now they show up as "Unknown" object types and can't have their respective contents listed. But that's a messy solution in my opinion and doesn't fully satisfy the requirement. Customer1 shouldn't even be aware of Customer2's existence.

There's one last piece of the puzzle missing, and that brings me to List Object Mode.

List Object Mode is one strategy available to Active Directory administrators to allow for hiding certain bits of data from certain users. List Object mode has to be enabled manually; it's turned off by default. To enable it, set the value of the dsHeuristics property in the Configuration partition to 001 using ADSI Edit, like so:

[Screenshot: the dsHeuristics attribute in ADSI Edit]

Now you will have a new access control entry in the list on objects in your forest: List Object. The ACE was actually there before, but Active Directory doesn't enforce it by default.

List Object Mode is a form of Access Based Enumeration (not to be confused with file system ABE), where items are not displayed to users that do not have List Object permissions to them. By default, when a user has the List Contents permission on an OU, and queries that OU, he or she is given a list of all child OUs in that parent OU, even if the user doesn't have read access to those other child OUs.  They show up in ADUC as "Unknown" object types and get that little blank page for an icon which is the Microsoft universal symbol for "wth is this?"

By using List Object permissions after having enabled it as just described, Active Directory evaluates the permissions of all the child objects under the object that was queried before returning the results to the user. Unless the user has the List Object permission on the object, it is omitted from the results. So now we have a customer user who is able to read just his or her own OU, and the other Customer OUs are completely hidden from view.
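Here is a deliberately simplified Python model of the difference, following the description above rather than Active Directory's actual access check algorithm. The user names alice and bob, the function name, and the representation of permissions as a plain dictionary are all invented for the illustration.

```python
def visible_children(children, user, list_object_mode):
    """Return the child OUs a user sees when enumerating the parent OU.

    children maps each child OU name to the set of users holding the
    List Object permission on it.
    """
    if not list_object_mode:
        # Default behavior: every child is listed, readable or not.
        return sorted(children)
    # List Object Mode: each child is checked before it is returned.
    return sorted(name for name, allowed in children.items() if user in allowed)


customers = {"Customer1": {"alice"}, "Customer2": {"bob"}}

print(visible_children(customers, "alice", list_object_mode=False))
# ['Customer1', 'Customer2']
print(visible_children(customers, "alice", list_object_mode=True))
# ['Customer1']
```

The per-child check in the second branch is also where the extra domain controller load comes from: every enumeration now costs one access check per child object.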

And no more Group Policy failures due to access denied, either.

So are there disadvantages to enabling and using List Object mode in your domain? Yes there are. So even though it may be appropriate for your environment, List Object Mode is not for everybody and it's not a decision that should be made lightly:

  • Significantly increased access control checks on LDAP queries = busier domain controllers.
  • You may need to rethink your entire User and Computer organization strategy to accommodate for how the new permissions work.
  • It's a less common configuration that fewer people are familiar with. Administrative complexity++. You need to fully document the change and make sure every administrator is aware of it.

So there you have it. Now go impress your friends with your knowledge of AD List Object Mode!

by ryan@myotherpcisacloud.com at May 20, 2013 05:00 PM

bc-log

Securely backing up your files with rdiff-backup and sudo

Backups are important, whether you are backing up your databases or your wedding pictures. The loss of data can ruin your day. While there is a huge list of backup software to choose from (some good, some not so good), one of the tools that I have used for years is rdiff-backup.

rdiff-backup is an rsync delta based backup tool that stores both a full mirror and incremental changes. It determines changes based on the rsync method of creating small delta files, which allows rdiff-backup to restore files to any point in time (within the specified retention period).

In the examples below I will refer to two server names, backup-server and server. The names are pretty self-explanatory but just in case, backup-server is the location where I permanently store files copied (backed up) from server.

Setting up rdiff-backup

Installing rdiff-backup is easy considering most Linux distributions include it in their default repositories. In this article I will be using Ubuntu for my example systems.

Note: For Red Hat you will need to enable the EPEL repository to install rdiff-backup via YUM.

Installing

In order for rdiff-backup to work both the source and destination will require the rdiff-backup package. You can install it via apt-get.

On backup-server:

root@backup-server# apt-get install rdiff-backup

On server:

root@server# apt-get install rdiff-backup

Validate rdiff-backup versions match

One of the quirky things about rdiff-backup is that the tool does not support backwards compatibility with older versions. For this reason it is best to make sure that your rdiff-backup versions are the same on both servers.

On backup-server:

root@backup-server# rdiff-backup --version
rdiff-backup 1.2.8

On server:

root@server# rdiff-backup --version
rdiff-backup 1.2.8
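
If you would rather not eyeball the two version strings, the comparison can be scripted. The following is only a minimal sketch: versions_match is a hypothetical helper name, and the usage line assumes key-based SSH to test@server already works.

```shell
# Sketch: compare two `rdiff-backup --version` strings before a backup runs.
# versions_match is a hypothetical helper name, not part of rdiff-backup.
versions_match() {
    # $1 = local version string, $2 = remote version string
    if [ "$1" = "$2" ]; then
        return 0
    fi
    echo "version mismatch: local '$1', remote '$2'" >&2
    return 1
}

# Example use on backup-server (assumes passwordless SSH is already set up):
#   versions_match "$(rdiff-backup --version)" \
#                  "$(ssh test@server rdiff-backup --version)" || exit 1
```

The helper only compares strings, so it treats any difference as a mismatch, which is the cautious choice given the compatibility quirk described above.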

Setting up SSH Keys

By default rdiff-backup uses SSH to communicate with remote systems. To avoid typing a password every time rdiff-backup runs, we will need to set up SSH keys with passphrase-less authentication.

On backup-server:

root@backup-server# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.

When asked leave the passphrase empty.

Once you have the SSH key generated you will need to copy the contents of /root/.ssh/id_rsa.pub to the remote servers for key-based authentication. For our configuration we will use a non-privileged user account (test), as this will let us implement rdiff-backup without giving the backup-server full access to the systems being backed up.

On backup-server:

root@backup-server:# scp /root/.ssh/id_rsa.pub test@server:/var/tmp/id_rsa.pub.temp

On server:

test@server:$ cat /var/tmp/id_rsa.pub.temp >> ~/.ssh/authorized_keys

You should now be able to SSH from backup-server to server without being asked for a password.
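
On a freshly created account the ~/.ssh directory may not exist yet, in which case the cat command above fails, and sshd commonly ignores key files with loose permissions. A small sketch of a more defensive version follows; install_pubkey is a hypothetical helper name and both paths are supplied by the caller.

```shell
# Sketch: append a public key to an authorized_keys file only if it is not
# already there, and set the permissions sshd normally expects.
# install_pubkey is a hypothetical helper, not a standard tool.
install_pubkey() {
    pubkey_file=$1
    auth_file=$2
    ssh_dir=$(dirname "$auth_file")

    # Create the directory and file if needed, then tighten permissions.
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    touch "$auth_file" && chmod 600 "$auth_file" || return 1

    # -x matches whole lines only, -F treats the key as a fixed string.
    key=$(cat "$pubkey_file") || return 1
    if ! grep -qxF -e "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
}

# Example use on server, as the test user:
#   install_pubkey /var/tmp/id_rsa.pub.temp ~/.ssh/authorized_keys
```

Running it twice leaves a single copy of the key in the file, so it is safe to re-run.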

Running backup jobs

Now that backup-server is able to SSH to server without being asked for a password and rdiff-backup is the same version on both systems, we are able to perform the first backup.

The directory we will back up today is /var/tmp/backmeup and we will be backing it up to /var/tmp/backups/server.example.com/. I personally prefer to back up to a directory named after the originating server; that way there is no question as to where the files came from.

On backup-server:

root@backup-server:# mkdir -p /var/tmp/backups/server.example.com
root@backup-server:# rdiff-backup test@server.example.com::/var/tmp/backmeup /var/tmp/backups/server.example.com/

rdiff-backup has now created a mirror of the /var/tmp/backmeup directory from server.example.com in /var/tmp/backups/server.example.com.

root@backup-server:# ls -la /var/tmp/backups/server.example.com/
total 52
drwxr-xr-x 3 root root 4096 May 19 13:07 .
drwxr-xr-x 3 root root 4096 May 19 13:53 ..
-rw-r--r-- 1 root root   25 May 19 13:07 10.file
-rw-r--r-- 1 root root   24 May 19 13:07 1.file
-rw-r--r-- 1 root root   24 May 19 13:07 2.file
-rw-r--r-- 1 root root   24 May 19 13:07 3.file
-rw-r--r-- 1 root root   24 May 19 13:07 4.file
-rw-r--r-- 1 root root   24 May 19 13:07 5.file
-rw-r--r-- 1 root root   24 May 19 13:07 6.file
-rw-r--r-- 1 root root   24 May 19 13:07 7.file
-rw-r--r-- 1 root root   24 May 19 13:07 8.file
-rw-r--r-- 1 root root   24 May 19 13:07 9.file
drwx------ 3 root root 4096 May 19 13:56 rdiff-backup-data

Now that we have backed up the original files we will run a second backup to capture changed data, this time with a little more verbosity.

root@backup-server:# rdiff-backup -v5 test@server.example.com::/var/tmp/backmeup /var/tmp/backups/server.example.com/
Using rdiff-backup version 1.2.8
Executing ssh -C test@server.example.com rdiff-backup --server
<truncated for length>
Backup: must_escape_dos_devices = 0
Starting increment operation /var/tmp/backmeup to /var/tmp/backups/server.example.com
Processing changed file .
Incrementing mirror file /var/tmp/backups/server.example.com
Processing changed file 1.file
Incrementing mirror file /var/tmp/backups/server.example.com/1.file
Processing changed file 10.file
Incrementing mirror file /var/tmp/backups/server.example.com/10.file
Processing changed file 2.file
Incrementing mirror file /var/tmp/backups/server.example.com/2.file
Processing changed file 3.file
Incrementing mirror file /var/tmp/backups/server.example.com/3.file
Processing changed file 4.file
Incrementing mirror file /var/tmp/backups/server.example.com/4.file
Processing changed file 5.file
Incrementing mirror file /var/tmp/backups/server.example.com/5.file
Processing changed file 6.file
Incrementing mirror file /var/tmp/backups/server.example.com/6.file
Processing changed file 7.file
Incrementing mirror file /var/tmp/backups/server.example.com/7.file
Processing changed file 8.file
Incrementing mirror file /var/tmp/backups/server.example.com/8.file
Processing changed file 9.file
Incrementing mirror file /var/tmp/backups/server.example.com/9.file

As you can see, -v5 tells us which files are being processed. This is handy for seeing what is being backed up or restored.

Now if we only change files 1 – 3 and run rdiff-backup again, it should only back up the files that have changed, leaving the others alone.

root@backup-server:# rdiff-backup -v5 test@server.example.com::/var/tmp/backmeup /var/tmp/backups/server.example.com/
Using rdiff-backup version 1.2.8
Executing ssh -C test@server.example.com rdiff-backup --server
<truncated for length>
Starting increment operation /var/tmp/backmeup to /var/tmp/backups/server.example.com
Processing changed file .
Incrementing mirror file /var/tmp/backups/server.example.com
Processing changed file 1.file
Incrementing mirror file /var/tmp/backups/server.example.com/1.file
Processing changed file 2.file
Incrementing mirror file /var/tmp/backups/server.example.com/2.file
Processing changed file 3.file
Incrementing mirror file /var/tmp/backups/server.example.com/3.file

If we look at the backup directory, the number of files has not changed, but the contents and time stamps have.

root@backup-server:# ls -la /var/tmp/backups/server.example.com/
total 52
drwxr-xr-x 3 root root 4096 May 19 13:07 .
drwxr-xr-x 3 root root 4096 May 19 13:53 ..
-rw-r--r-- 1 root root   76 May 19 14:10 10.file
-rw-r--r-- 1 root root   98 May 19 14:16 1.file
-rw-r--r-- 1 root root   98 May 19 14:16 2.file
-rw-r--r-- 1 root root   98 May 19 14:16 3.file
-rw-r--r-- 1 root root   73 May 19 14:10 4.file
-rw-r--r-- 1 root root   73 May 19 14:10 5.file
-rw-r--r-- 1 root root   73 May 19 14:10 6.file
-rw-r--r-- 1 root root   73 May 19 14:10 7.file
-rw-r--r-- 1 root root   73 May 19 14:10 8.file
-rw-r--r-- 1 root root   73 May 19 14:10 9.file
drwx------ 3 root root 4096 May 19 14:16 rdiff-backup-data

rdiff-backup keeps the mirror matching the most recent backup and stores the differences needed to rebuild older versions as diff files within the rdiff-backup-data directory. It is not advised to modify or interact with the mirror or diff files directly; it is better to use the rdiff-backup command itself.

Listing available backups

To see the available backups we can use rdiff-backup -l.

root@backup-server:# rdiff-backup -l /var/tmp/backups/server.example.com/
Found 5 increments:
    increments.2013-05-19T13:56:57-07:00.dir   Sun May 19 13:56:57 2013
    increments.2013-05-19T14:09:52-07:00.dir   Sun May 19 14:09:52 2013
    increments.2013-05-19T14:11:29-07:00.dir   Sun May 19 14:11:29 2013
    increments.2013-05-19T14:16:44-07:00.dir   Sun May 19 14:16:44 2013
    increments.2013-05-19T14:29:38-07:00.dir   Sun May 19 14:29:38 2013
Current mirror: Sun May 19 14:30:20 2013

If a file has been deleted and rdiff-backup has run since the deletion, you may not find the file in the directory. You can still list the available backups for that file by specifying it as if it did exist.

root@backup-server:# rdiff-backup -l /var/tmp/backups/server.example.com/1.file
Found 4 increments:
    1.file.2013-05-19T13:56:57-07:00.diff.gz   Sun May 19 13:56:57 2013
    1.file.2013-05-19T14:09:52-07:00.diff.gz   Sun May 19 14:09:52 2013
    1.file.2013-05-19T14:11:29-07:00.diff.gz   Sun May 19 14:11:29 2013
    1.file.2013-05-19T14:16:44-07:00.snapshot.gz   Sun May 19 14:16:44 2013
Current mirror: Sun May 19 14:30:20 2013
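
The listing output can also be fed to a script, for example to warn when a backup directory has fewer increments than expected. This is only a sketch: increment_count is a hypothetical helper, and it assumes the "Found N increments:" header shown above for version 1.2.8, which other versions may word differently.

```shell
# Sketch: pull the increment count out of `rdiff-backup -l` output read on stdin.
# Assumes the "Found N increments:" header printed by version 1.2.8.
increment_count() {
    sed -n 's/^Found \([0-9][0-9]*\) increments:.*$/\1/p'
}

# Example use on backup-server:
#   count=$(rdiff-backup -l /var/tmp/backups/server.example.com/ | increment_count)
#   [ "${count:-0}" -ge 1 ] || echo "no increments found" >&2
```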

Restoring backed up files and directories

rdiff-backup has the ability to restore either individual files or entire directories, as long as rdiff-backup has the item within its incremental lists.

Restoring an individual file

When restoring an individual file with rdiff-backup you can specify either a time or the incremental file to restore from. The following example uses the incremental file.

root@backup-server:# cd server.example.com/rdiff-backup-data/increments/
root@backup-server:# rdiff-backup -v5 1.file.2013-05-19T14\:11\:29-07\:00.diff.gz test@server.example.com::/var/tmp/backmeup/1.file

Restoring a directory

When restoring a directory however we will need to specify a specific time that we want to restore to.

root@backup-server:# rdiff-backup -v5 -r 1h server.example.com/ test@server.example.com::/var/tmp/backmeup

This command will restore the entire directory to where it was 1 hour ago, or as close to that as the available backups allow. rdiff-backup supports many time formats, but I commonly find myself using the xDays format (e.g. 2D for 2 days).

Don’t use the force flag

The above command will restore the whole directory, but only if the target directory is empty. If the directory has files in it and you ask rdiff-backup to restore that directory, then it will try to remove the existing files in order to match your backup. This action could result in data that has not been backed up being removed.

To protect against accidental deletion rdiff-backup requires the force flag to be used anytime a file is being overwritten or deleted.

root@backup-server:# rdiff-backup -v5 -r 1h server.example.com/ test@server.example.com::/var/tmp/backmeup
Using rdiff-backup version 1.2.8
Executing ssh -C server.example.com rdiff-backup --server
Fatal Error: Restore target /var/tmp/backmeup already exists, specify --force to overwrite.

I advise avoiding the use of the force flag whenever possible. If you truly do not want the contents of the directory, then just remove them manually before restoring. I have seen many times where people used the force flag and accidentally overwrote a directory they did not mean to (like /etc/ for example…).
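
One way to stay away from --force is to check the restore target yourself before running the restore. The sketch below mirrors the behavior described above (a missing or empty directory is fine, anything else is not); restore_target_is_safe is a hypothetical helper and it has to run on the machine where the target path lives.

```shell
# Sketch: succeed only when a restore target is missing or an empty directory.
# restore_target_is_safe is a hypothetical helper, not an rdiff-backup feature.
restore_target_is_safe() {
    # A missing path is safe: nothing there can be overwritten.
    [ -e "$1" ] || return 0
    # An existing file (or other non-directory) would be overwritten.
    [ -d "$1" ] || return 1
    # An existing directory is safe only with no entries, dotfiles included.
    [ -z "$(ls -A -- "$1")" ]
}

# Example use on server, before asking backup-server to restore:
#   restore_target_is_safe /var/tmp/backmeup || echo "target is not empty" >&2
```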

Restoring to another location

When restoring with rdiff-backup you can restore files or directories to a location other than their originating source. This can be handy if you need to check the contents before completely restoring the file.

root@backup-server:# rdiff-backup -v5 -r 3h server.example.com/1.file test@server.example.com::/var/tmp/backmeup/1.file.restore

Backup Retention

Backups are only as good as their retention period. Without a retention period you will eventually run out of disk space or use far more disk space than you had originally planned. With rdiff-backup you can tell it to keep backups either for a certain amount of time or up to a certain number of backups.

On backup-server:

Time method

The time method uses the same time format as restore.

root@backup-server:# rdiff-backup --force --remove-older-than 4h /var/tmp/backups/server.example.com

Number of backups method

To specify a number of backups use the number followed by a capital B.

root@backup-server:# rdiff-backup --force --remove-older-than 4B /var/tmp/backups/server.example.com

I used the force flag with the above commands as rdiff-backup requires force to be given if you are removing more than one incremental copy.

Providing more access with sudo

So far we have been backing up files and directories that the test user has access to. If we were to try to back up or restore a file that the test user does not have access to, then the backup/restore will fail with a permission denied error. To provide greater access you can either run rdiff-backup as the root user on the remote systems (which raises security concerns), or provide the test user with the ability to run rdiff-backup as the root user via sudo.

Example of permission denied error:

root@backup-server:# rdiff-backup -v5 test@server.example.com::/var/tmp/backmeup /var/tmp/backups/server.example.com
Using rdiff-backup version 1.2.8
Executing ssh -C test@server.example.com rdiff-backup --server
Exception '[Errno 13] Permission denied: '/var/tmp/backmeup'' raised of class '<type 'exceptions.OSError'>':

Adding the rdiff-backup into /etc/sudoers

In order to allow the test user the ability to run rdiff-backup as root we need to add an entry into the /etc/sudoers file, which controls what commands users can run via sudo. To modify this file we will use the visudo command.

On server:

root@server:/var/tmp# visudo

Append:

## Give test user the ability to run rdiff-backup
test    ALL = NOPASSWD: /usr/bin/rdiff-backup --server

As the test user you will now see rdiff-backup in the list of available sudo commands

test@server:~$ sudo -l
User test may run the following commands on this host:
    (root) NOPASSWD: /usr/bin/rdiff-backup --server

We are specifying NOPASSWD as by default sudo would normally ask the user for their password, which would not work very well with an automated backup script.

Running rdiff-backup with remote-schema

In order for rdiff-backup to use sudo we will need to change the command we have been using a bit; we will use the --remote-schema flag to tell rdiff-backup to run "sudo /usr/bin/rdiff-backup --server" on the remote system.

On backup-server:

Backup command

root@backup-server:# rdiff-backup -v5 --remote-schema 'ssh -C %s "sudo /usr/bin/rdiff-backup --server"' \
test@server.example.com::/var/tmp/backmeup /var/tmp/backups/server.example.com

<truncated>
Processing changed file 9.file
Incrementing mirror file /var/tmp/backups/server.example.com/9.file

Restore command

root@backup-server:# rdiff-backup -v5 -r 3h --remote-schema 'ssh -C %s "sudo /usr/bin/rdiff-backup --server"' \
/var/tmp/backups/server.example.com/5.file test@server.example.com::/var/tmp/backmeup/5.file

By adding sudo we are allowing the test user to backup and restore any file on the system with rdiff-backup.

Adding restrict-read-only for even more security

Using rdiff-backup with sudo prevents people from using the SSH key to log in as root on all of our remote systems. However, this solution by itself does not stop someone from using rdiff-backup's restore function to deploy compromised files.

For even more security we can use the --restrict-read-only flag to restrict rdiff-backup to only reading files, blocking all write requests. The downside of this setting is that it also prevents valid restore requests. If you are more worried about someone accessing your systems than about having to edit the sudoers file every time you want to restore a file, then this is a good option.

Adding restrict-read-only to the sudoers entry

In order to add --restrict-read-only we need to add it to both the rdiff-backup command and the sudoers entry.

root@server# visudo

Modify to:

test    ALL = NOPASSWD: /usr/bin/rdiff-backup --server --restrict-read-only /

The / at the end is the path that you want rdiff-backup to be restricted to. This entry would give rdiff-backup the ability to backup all files on the system. If you are not backing up the entire system you can restrict this to a specific path as well to prevent rdiff-backup from reading other files on the system not within your path.

Running the backup command with restrict-read-only

Now that sudo allows us to run the full command we can add it to the remote-schema.

root@backup-server:# rdiff-backup -v5 --remote-schema 'ssh -C %s "sudo /usr/bin/rdiff-backup --server --restrict-read-only /"' \
 test@server.example.com::/var/tmp/backmeup /var/tmp/backups/server.example.com
Using rdiff-backup version 1.2.8
Executing ssh -C test@server.example.com "sudo /usr/bin/rdiff-backup --server"

If you modified the path in the sudoers file you would need to do the same with the rdiff-backup command above.
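
Because the restricted path has to appear identically in the sudoers entry and in the remote-schema string, it can help to build the string from one variable. A minimal sketch, where remote_schema is a hypothetical helper name:

```shell
# Sketch: build the --remote-schema string from a single path argument.
# remote_schema is a hypothetical helper; the sudoers entry on the server must
# list the same path, or sudo will refuse to run the command.
remote_schema() {
    # %%s prints a literal %s, which rdiff-backup replaces with the remote host.
    printf 'ssh -C %%s "sudo /usr/bin/rdiff-backup --server --restrict-read-only %s"\n' "$1"
}

# Example use on backup-server:
#   rdiff-backup -v5 --remote-schema "$(remote_schema /var/tmp/backmeup)" \
#       test@server.example.com::/var/tmp/backmeup /var/tmp/backups/server.example.com
```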

Automating with Cron

Automating rdiff-backup with cron is as simple as tossing the commands above into a script and adding it to the crontab. The script below is meant only as an example. I would advise anyone reading this to script in some more intelligence to handle failed backups and concurrent runs, but if you need something quick and dirty this will work.

On backup-server:

Creating the backup script

root@backup-server# vi /root/backup-example.sh

Add:

#!/bin/bash
## Example rdiff-backup script - http://bencane.com
## This is not fancy, and you should really add error checking

# Backup
rdiff-backup -v5 --remote-schema 'ssh -C %s "sudo /usr/bin/rdiff-backup --server --restrict-read-only /"' \
 test@server.example.com::/var/tmp/backmeup /var/tmp/backups/server.example.com

# Clean Increments
rdiff-backup --force --remove-older-than 4B /var/tmp/backups/server.example.com

Adding to crontab

Once you have the script you can simply add the script into the crontab on the backup-server.

root@backup-server# crontab -e

Append:

# m h  dom mon dow   command
0 0 * * * /root/backup-example.sh > /dev/null 2>&1

The above crontab entry will run backup-example.sh every night at midnight. This will provide you with 4 days of incremental copies at all times.
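
As a starting point for the extra intelligence mentioned earlier, here is a rough sketch of a wrapper that refuses to start while a previous run is still going and hands back the exit status of the command it ran. run_locked and the lock path are hypothetical; mkdir is used as the lock because creating a directory either succeeds or fails in one step.

```shell
# Sketch: run a command under a simple lock so two cron runs cannot overlap,
# and return the command's exit status to the caller.
# run_locked and the lock directory are hypothetical names.
run_locked() {
    lock_dir=$1
    shift
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "another run holds $lock_dir, skipping" >&2
        return 75   # arbitrary non-zero status meaning "skipped"
    fi
    "$@"
    status=$?
    rmdir "$lock_dir"
    return "$status"
}

# Example use from a wrapper script on backup-server:
#   run_locked /var/tmp/backup-example.lock /root/backup-example.sh \
#       || echo "backup failed or was skipped" >&2
```

If the wrapped command is killed before it finishes, the lock directory is left behind and has to be removed by hand, so a production script would want stale-lock handling as well.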


by Benjamin Cane at May 20, 2013 04:10 PM

/sys/admin/blog

HBR on what value creation will look like in the future

What value creation will look like in the future:  http://blogs.hbr.org/cs/2013/05/what_value_creation_will_look_like_in_the_future.html

A teaser from the article:

“Organizations have nearly perfected implementing the industrial model of managing work — the effort applied toward completing a task. For individuals, this model ensures that we know what we’re supposed to do each day. For organizations, it guarantees predictability and efficiency. The problem with the model is that work is becoming commoditized at an increasing rate, extending beyond manual tasks into knowledge work, as data entry, purchasing, billing, payroll, and similar responsibilities become automated. If your organization draws value from optimizing repetitive work, you’ll find that it will be increasingly difficult to extract that value.”

What you can do:

  • Master the machines.
  • Get obsessed with value.
  • Make creativity real.

 

by Joe at May 20, 2013 03:08 PM

Standalone Sysadmin

Advancing Women in Computing - Panelists Needed!

I got an email from a friend of mine who is soliciting for women who work in IT (preferably IT administration) to take part in a panel at LISA'13 called "Advancing Women in Computing". You can also watch last year's panel to get a feel for what it's like.

Once again, my good friend Rikki Endsley will be moderating the (probably) 90 minute session. They are giving preference to women in the Washington DC area (or people who are going to be attending LISA anyway), so if you're in that region and this sounds like something that interests you, email lisa13gurus@usenix.org (or drop a line here, and I'll get the message to them).

Thanks!

by Matt Simmons at May 20, 2013 02:53 PM

Rich Bowen

What I've learned at SourceForge

Today I'll be leaving SourceForge and taking a role at Red Hat. Please don't think for a moment that it's because I don't like SourceForge. I continue to think that SourceForge does community *way* better than either GitHub or Google Code, and while there are places where the platform can improve, the team that's working on it is one of the finest bunches of engineers I've ever had the privilege of working with.

Here's a few of the many things I've learned at SourceForge.

People are passionate

Every time I talk to anybody about my job, I mention two projects: PonyKart and OpenMRS. These projects illustrate to me how people can be passionate about anything. Having talked with the leads of both of these projects, I'm blown away by their passion for excellence.

Of course, these projects could hardly be more different.

PonyKart is a My Little Pony themed Mario-Kart style game. It's fun. The physics are well done. The courses are well designed. The community is very engaged. And it has My Little Pony characters in it. The guys that did this project wanted it to be a MLP game, but they also wanted it to be excellent. They wanted it to be fun. They wanted it to be *good*. They are passionate about it.

The OpenMRS project is a medical records system that was developed for a hospital in Kenya that had a hacked-together Access database monstrosity, and it was faster and easier for these guys to hack something together than to try to fix what was there. But that wasn't enough. They were passionate. They wanted it to be done right, and they wanted hospitals all over the world to benefit from it. And now they have a non-profit dedicated to giving this product away to hospitals in developing nations that need it. These guys are my heroes.

I am continually blown away by the quest for excellence, and the vast range of ways that it manifests itself.

People are kind

I've met amazing people in my time at SourceForge. These people are helpful, kind, patient, and, as I've mentioned, passionate. For the most part, people get that I'm human and can't solve all of their problems immediately. They get that we all have the limitation of time and resources.

Most people *don't* throw tantrums or demand their way. For this I am very grateful. I'm glad to have met a few of the nice people.

People are cruel

Sure, SourceForge is the underdog right now. I get that. It's not necessary to be a jerk.

It's hard to remember, when people are being jerks, that they're in the minority. Most people are, in fact, nice. But the jerks are very loud.

I'd like to remind the jerks that the folks who happen to be developing their project on the SourceForge platform are passionate, and they are pragmatic, and they are doing something useful while you fling mud at them.

'nuff said.

People are pragmatic

Tools are tools. They are not your children.

For the most part, people want to get a job done, and they use the tools they have, because the focus is the task, not the tools. Once, we used CVS and MailMan and we *liked* it. SVN is better. Some people like Git better. But if we had to use CVS and MailMan, you know what? We'd still get stuff done.

Religious debates over the relative merits of DVCS and CVCS systems are all well and good over beer at conferences, but most of us have a job to do, and we don't have time for that indulgence. You may, in fact, be right, but I don't have that kind of time.

I grow very weary of the This vs That flame wars that have characterized the IT world for so long. Perl vs Python, VI vs Emacs, Linux vs Windows vs Mac, Git vs SVN. The thing is, if you're a professional, you need to know *all* of them, and you're not coming across as brilliant, you're coming across as only knowing one tool. Nice hammer. Sometimes a screwdriver is useful.

But, much as most people are nice, it turns out most people are pragmatic. Most people don't have time for those debates either. They want to get their job done. I really appreciate having met a lot of those kinds of people.

by rbowen at May 20, 2013 12:05 PM

The Nubby Admin

Lessons Learned from Cascading Failure and Face Punching a House

In regard to my post “Cascading Failure, Technical Debt, and Punching a House with my Face“, I was asked about my conclusions and how I dug myself out of that hole.

Before I go any further, let me confess that the blog post itself was another failure in that saga. I began writing it the day I discovered that a forgotten boot DVD in the optical drive was the cause of the server not coming back up, and I continued adding to the post over the next several days. Because I didn’t give a contiguous block of time to the writing, I left out some details. Furthermore, I posted it too soon because I was absent minded as I was saving the post and accidentally published it instead of saving it. I had to quickly unpublish it, however it was too late; an email notification went out to my subscribers so the unfinished article was read by a few people. Then when they clicked to go to the blog, they were met with a 404 error. Finally, I continued to write the post, scheduled it to be published the next day, and then completely forgot to polish it off because I staggered away from my computer to hit the treadmill, shower, and then collapse into bed. I woke up the next day, my computer still on, my desk still in “work mode” with items scattered all over it, and the WordPress control panel still open with the post being edited. The failure train was still going full steam.

If you’ve read the previous blog post, you might want to read it again because it’s now a little better presented with some key points added that I had left out.

The conclusions that I came to as a result of the Circus of Calamity that happened over Mother’s Day weekend are nothing new to me, and very likely nothing new to you. However I think they bear enough fruit to be written down. I’d like to codify my thoughts on technical debt in the system administration world and how to avoid it or deal with it if you’re currently in over your head. Mayhaps this will be the first effort in a larger work.

Without giving too much thought to the order in which the following tenets should be valued, here are some of the lessons that stand out to me.

Lessons learned:

Devote contiguous time to projects. In this case, my client gives me an hour cap. I can work on their systems for a certain amount of hours per month. That naturally lends itself to non contiguous blocks of time as I hit my maximum and then wait for the month to roll over. However, even with that limit, it’s best to spend it all in a largely uninterrupted segment of time. Regardless of if you are salaried, full time contracted, or an independent hired gun that works for whoever needs you whenever they can pay you, the concept remains the same.

If you have something to do, devote as much contiguous time to performing a task as you can. In my case I tend to carve my day up thusly: four or five hours on one client’s systems, then another four or five hours on another client, and then perform three or four hours of tasks that I need to do on my own systems, bookkeeping, and general business busywork. This leads to long days of frequent context shifting. In my last post’s situation, that led to errors like leaving a DVD in the optical drive of a server as well as not copying over the client’s password store on a schedule. The last post of mine was even a victim; as I context shifted too frequently, I forgot that I still had some polishing to do on it.

Don’t break your thoughts up. Think on a specific task, task set, and/or client for as long as possible with as few interruptions and context changes as possible.

Fix problems as they come. “I’ll get to that later” is death. The problem with the ILO needed to be addressed immediately. The problem with the BIOS clock resetting after power state changes needed to be addressed immediately. The problem with the hard drive controller failing needed to be addressed immediately. These were emergency level things that got distracted by the peculiarities of being an independent consultant and working for a place that has very small project budgets. Or perhaps that had nothing to do with it. Perhaps I could have pushed harder and taken more initiative. I know this office well and could have ordered the BIOS battery and contacted a local contractor to walk in and replace it. Act first, bill later! After all the years working with this group, I’m fairly certain that such an emergency action would be accepted and paid with no question.

Regardless of my specific situation, the idea is that, with few exceptions, one needs to address problems that crop up at that time, and not later. This is a sister concept to devoting contiguous time to a project. Keep on with solving a problem, and its directly related troubles, for as long as possible without interruption. That can mean hours, or days, or longer if need be. This can lead to a rabbit-hole scenario where one simple change then leads to a huge infrastructure change. However, simply glossing over an issue adds another mound of debt to the overall systems debt. Don’t know why DNS queries are taking five seconds to resolve? Eh, it’s just a few seconds to wait and we’ve got bigger issues to solve with the payroll application. However, when a problem crops up as a result of DNS not being as smooth as one would expect, now you’ve got to face the DNS problem with another problem on your back. That pressure may lead to greater problems by encouraging you to implement half-measure solutions for the DNS resolution delay, which then causes another problem, which then causes another, and another… etc. and etc.

This is rather hard for consultants like me, however, so this bad tendency is strengthened. Consultants are paid by the hour in most cases, so the pressure to deliver without drawing out billable time is great. If a flat-rate project quote is made, you’re always one step away from hitting a mine field if you go too deeply into ancillary systems. I’ve lost my shirt as a result of a flat-rate project quote that ended up with scope-creep. While once in a while chasing down each problem to its root turns out great because you get extra business to fix those systems, most of the time it causes much more pain and suffering in the form of unpaid invoices, broken systems, and angry business owners.

This also seems rather tough for any IT person in general. At any given moment we’ve got large amounts of projects and rooms full of executives all competing for our time. Each one thinks they’re the most important person and project. Each one wants to be completed yesterday. There comes a point when being driven by the tyranny of the urgent has to stop, one way or another. The balance of when to chase down a problem to completion is a fine one, but I think we should collectively assume that a problem should be fixed immediately and require strong evidence to the contrary before abandoning the pursuit.

Get rest and stay healthy. I’ve been grinding hard for far too long. Starting a business is no joke. Keeping the business alive with paying clients all the while sharpening your skills and keeping potential clients engaged just in case existing business leaves is even less funny. I’ve been working so many hours in a week for three straight years that I’m aging myself prematurely. I’d love it if I could take a vacation or relax more, but the truth of the matter is that the clients I’ve picked up haven’t been the most lucrative and there have been some billing and invoicing… issues. I don’t have the time or the money to do much else with life except clatter away in front of a computer.

I’ve been departing in the evenings to get back to a hobby that has always interested me: weight lifting. That’s helped, and I’m contentedly regaining strength and getting re-bitten by the lifting bug and the addiction to “the pump”, but this is a fairly new recommitment. The plain facts are that I’m tired and zapped of mental energy. I’ve been grinding and it shows. The dumb mistakes I made for the client in the last post are in some part a result of mental exhaustion. I’ve lost some of the love for information technology that I used to have. I’ve visibly aged ten years in just three and I don’t have much material gain to show for it. I’m tired, and I did a disservice to my best client because of it. Shame on me.

(That said, if anyone knows of a business known for paying market hourly rates, and on time, that needs an independent system administrator with my skills based in Phoenix, Arizona, I’ve now got some time that I can book for a new client. Please, no employment positions at this point. Only consultant / contractor.)

Kanban can help! Kanban – I lurves it. I’m not about to suggest that it is the solution to everyone’s problems, but for those of us who are more tactile and visual, this kind of project and task management system can really make a difference. Kanban, in the simplest explanation that I’ve come up with so far, is a means of visualizing work and encouraging a limit to concurrent work.

In the cases of carving out contiguous time, fixing problems at the earliest possible moment after discovery, and even staying rested, kanban can be very useful since it forces you to be constantly aware of what work is currently being performed and what work is waiting to be performed. I use it to break larger projects into smaller chunks. Typically if a task would take more than four hours then it needs to be broken down into more than one ticket. However, in some cases I simply write “Work on X project, 4 hours” and that’s enough. But I’m digressing into the specifics of kanban when that’s not the point here.
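
For the programmatically inclined, the two rules above (a cap on concurrent work, and splitting anything over four hours) can be sketched in a few lines of Python. This is only a toy model, not a real kanban tool: the `Board` class, the `WIP_LIMIT` of two, and the ticket naming are all made up for illustration.

```python
from dataclasses import dataclass, field

MAX_TICKET_HOURS = 4  # the four-hour rule of thumb described above
WIP_LIMIT = 2         # assumed cap on concurrent work, chosen for illustration


@dataclass
class Board:
    waiting: list = field(default_factory=list)
    in_progress: list = field(default_factory=list)

    def add(self, name, hours):
        """Queue a task, splitting anything over MAX_TICKET_HOURS into chunks."""
        chunks = []
        while hours > 0:
            chunk = min(hours, MAX_TICKET_HOURS)
            chunks.append(chunk)
            hours -= chunk
        for i, chunk in enumerate(chunks, start=1):
            if len(chunks) == 1:
                label = name
            else:
                label = f"{name} (part {i} of {len(chunks)})"
            self.waiting.append((label, chunk))

    def start_next(self):
        """Pull the next waiting ticket, refusing if the WIP limit is reached."""
        if len(self.in_progress) >= WIP_LIMIT:
            raise RuntimeError("WIP limit reached: finish or shelve something first")
        if not self.waiting:
            raise RuntimeError("nothing is waiting")
        ticket = self.waiting.pop(0)
        self.in_progress.append(ticket)
        return ticket


board = Board()
board.add("Migrate failing server", 10)  # becomes three tickets: 4h, 4h, 2h
board.add("Patch firewall", 1)           # small enough to stay one ticket
```

Calling `start_next()` a third time while two tickets are still in progress raises an error instead of quietly letting work pile up.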

Kanban as a means of staying on target and being ever aware of what context you’re currently in can be a huge boon to the hamstrung IT person. It’s especially helpful if you work with others and keep the kanban board highly visible. That way people always know what you’re working on. It’s very helpful to have management buy-in to that kind of system. Why? Imagine you’re in an environment where everyone thinks their projects are of utmost importance. If you limit your concurrent working projects to one or two like a good kanban system suggests, you can point to the board, specifically the area that is dedicated to tasks that you are currently working on this very moment, and force a choice. When someone complains about their stuff not getting done, then, with the blessings of leadership, you can require that the person who wants their project given top priority contact the owner of the task currently being worked on and explain to them why it has to be shelved and how long it will be before you can get back to it.

This can really help. Everyone understands sticky notes on a white board. If Gilles in accounting thinks that his project is the most important thing, tell him he can move James’s ticket from the currently worked-on project over to the holding tank. You know James. Six feet eight inches of security guard whose beard has more muscle than your entire body? Good luck, Gilles! But seriously, putting things into perspective for various project owners can seriously help you block off contiguous amounts of time and stay on track.

Sadly, in my case as an independent consultant, I can’t make Client A call Client B and explain why Client B needs to let up on using my time. It doesn’t work that way. However, I can still use kanban to aid in my own self discipline of keeping track of what I’m working on and when.

Digging out of the Hole

Some people have asked how I’ve dug out of that hole with the client. I’m still working on it. Meanwhile, in addition to that client, I’ve got a few other projects that I’m trying to sew up, so it’s a delicate balance of time and guarding contiguous working hours on each project. I can tell you that when I do get out of the hole of technical debt, it will be in large part due to kanban and a good set of targeted goals.

I need to first create a large vantage point for all my major task spheres. Currently I have three clients that I’m working with. One is a bit of a deadbeat. One is low priority. One is high priority (the one with the crashing server). In my personal life, I’ve got a few large goals to be concerned with as well. I’ll make those large circles of potential task lists and then figure out what is most important to me at this point in my life.

I can tell you that the high priority client (who also pays on time and is in good standing) will be at the top of the list of work projects. My low priority client will be a close second because the project is smaller and close to completion. It will be a great relief to get them finished. The client who is late paying is down on the list of priorities. Even them paying up their overdue invoices won’t get them past third spot until I see a history of on-time payments and not wanting cut-rate hourly rates.

Within each task sphere will be a list of tasks ordered by importance. Of highest importance for the client with the failing server will be to get new equipment in and start the migration. That much is obvious. I’ll need to address each quirk and hiccup along the way, such as ILOs dropping off the network and the like. Furthermore, I’ll need to guard contiguous time. Instead of dividing a day in three parts, (four or five hours for one client, four or five hours for a second client, and then a handful of hours for business management), I prefer to block off multiple days in a row for each client and task. That looks like this: Monday and Tuesday for one client, Wednesday for another, Thursday for business management, Friday and Saturday for another client (Yes, I work six days a week. It ain’t easy being self employed).

This should facilitate a steady march towards normalcy and healthy systems.

Parting Thoughts

  1. Don’t assume something isn’t important. Assume it’s important and require a lot of evidence to prove that it’s not. Give a serious effort at fixing any problem at the earliest possible moment.
  2. Dedicate contiguous time to completing a task and don’t dare multitask.
  3. Prioritize based on danger and worth.
  4. Chill out and get some exercise. BRO, DO YOU EVEN LIFT?

Not exactly ground breaking advice, but perhaps you need to hear it. I know I do. Preach it. Got any other tips and ideas? Any stories of failure and phoenix-like recovery from ashes? Let me know in the comments below or send me a guest post!

by Wesley David at May 20, 2013 10:44 AM

Chris Siebenmann

Today's comment spammer trick: regurgitated comments

I log the contents of some attempted spam comments here on Wandering Thoughts (in short, when the spammer seems to be trying hard). Usually this doesn't get anything, but today my trawl through the logs turned up a succession of bizarre and odd comment attempts. The text had misspellings and typos but it generally made sense and most of the comment attempts were even about technical things that are vaguely on topic for here. But they were invariably attempts to comment on very inapplicable entries.

When I looked at the logs in detail, one of the most striking was a series of comment attempts that looked very much like a conversation between two or more people about using git on home directories. This was very odd since none of the comments were being posted, yet the people were pretty clearly replying to each other; I began to develop all sorts of theories about disturbingly intelligent content auto-generation. Finally I noticed something in one of the comment texts and the penny dropped:

[...] Possibly related posts: (automatically generated)Heroku, the Rails app.

There is a really simple way to get this text into a spam comment: you can be scraping content from existing blog posts and/or blog comments. So my new theory is that the would-be comment spammer is scraping comment text from other blogs, mangling it somewhat, and then spam-posting it on other blogs (including mine).

The mangled text doesn't seem to have any links or other spam-relevant text so I'm not sure why the spammers are doing this. Maybe they're fishing to see what blogs will allow their comments through moderation and will follow up with more active content on blogs where this works.

Sidebar: source details and other things

So far 30 different IP addresses have tried this here today; most IP addresses have made only one attempt each. The IP addresses cover a large range of source networks. A few of them are CBL listed but that's pretty much it as far as DNBLs are concerned. Four of the IP addresses actually belong to Microsoft (168.63.43.185, 168.63.62.182, 168.63.76.184, and 168.63.84.217; all four are currently listed on the CBL). I'm assuming that these are compromised machines, VPS servers, or both.

Many of the IP addresses also made a burst of GET requests for various other URLs here. Maybe they're scraping text from Wandering Thoughts for use in their corpus for their next spam run somewhere else.

by cks at May 20, 2013 02:45 AM

May 19, 2013

Rands in Repose

Unknowable

Each year, the race to get a ticket for WWDC is on. Even with early warning, the window of ticket availability shrinks with every passing year. 2013 being no different: 2 minutes.

Capping the number of tickets is a classic Apple move: we're going to create a sense of exclusivity by creating an artificial constraint. Moscone Center is huge. Apple could blink and triple the size of the event, but I can't think of the last time the ticket ceiling at WWDC went up. 5000 attendees - that's it.

WWDC is a great event. I've been going for years without a ticket and I still have amazing nights spending time with dear friends debating the state of Apple. Logic would dictate that increasing the number of tickets would increase the "product": the army of foaming-at-the-mouth fanboys'n'girls who, I believe, are one of the best (and cheapest?) organic marketing assets in the industry.

Nope. 5000. That's it.

This type of constraint reeks of Steve Jobs. The rumor at Apple was that Steve capped many of the teams in Cupertino. Mac OS X and Marketing Communications being two successful teams that had their headcount capped. During the 2000s, while Apple was gaining traction across the planet, the team responsible for getting the word out, Marketing Communications ("MarCom"), was allegedly capped at 100 heads. The reasoning I heard was that Steve wanted to keep the teams feeling small, but, more importantly, I think he wanted to keep them knowable.

Of course, with the amount of work they had to produce supporting WWDCs, MacWorlds, product launches, and all the other advertising, they relied on expensive external vendors to do the bulk of the heavy lifting. While back in Cupertino, the 100 represented a small, well-understood group where I believe Steve could not only easily understand every single story being told by Apple, but, more importantly, the 100 could know each other.

When you talk about change or optimum team sizes, Dunbar's number is usually thrown down as scientific evidence of something you already know in your bones. Shit gets weird somewhere between 100 and 200 people. You can no longer keep the individual state of each of the other people in your team or company in your head. Which means communication becomes more taxing. Rather than walking up to Fred and saying, "What's up?" you cautiously walk up to a person you don't know and sheepishly ask, "Yeah... who are you?"

What was easy becomes hard. What used to be maintained in your head now involves an extra email or an additional meeting. What was familiar becomes unfamiliar and frustrating. Culture is diluted, communication becomes taxed, and people start saying, "I remember when..."

Capping the headcount of a team necessary to shaping the story of an increasingly successful company seems counter-intuitive. We're doing well, we should invest more. This type of thinking heavily discounts the taxes associated with rapid team growth, the biggest of which, in my opinion, is no longer being able to easily discern what is going on in a team of people.

Apple's MarCom department being capped at 100 achieved two very different objectives. First, it made the work the team was doing knowable - you could discern who was doing what because there just weren't that many full-time people. This allowed for dictatorial control that has given Apple clear and consistent messaging. Second, the constraint meant that every single person counted. While I never worked on the team, I'm certain they were much quicker in dealing with low performers because you could still discern the difference one additional high performing person would make. While this could certainly be viewed as a constant threat of being fired, it could also make for a high performing team.

The effects of capping WWDC tickets are different because you're talking about a larger population, but some of the effects are the same. Each year, WWDC is held in Moscone West. You know that the big Apple logo will be emblazoned on the side of the building. You know the names of the conference rooms, you know where the snacks will be. But, for me, I know who will be there. I end up in the same bars with the same dear friends and we get foamy at the mouth about Apple because we feel like we know it.

The cap on WWDC tickets means it won't go the way of SXSW - a wildly successful conference that has grown consistently since its inception. I used to go every year until one late night we looked around a huge sea of strangers and decided that we no longer knew this conference. The experience had become diluted. It had become unfamiliar, full of strangers, and unknowable.

May 19, 2013 06:41 PM

witalis

cisco config archive doing

The basics of archiving router configuration are covered at:

http://www.techrepublic.com/blog/networking/use-the-cisco-ios-archive-command-to-archive-your-routers-configuration/532

I would like to add some useful commands in this area:

# sh archive config differences nvram:startup-config system:running-config

it’s pretty self explanatory – produces output with differences between startup and running config.
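
If you keep copies of the configs off the box, a rough offline equivalent of that comparison can be sketched in Python with the standard `difflib` module. This is an illustration under assumptions rather than what IOS does internally: the function name `config_differences` and the sample config text are hypothetical, and the output format is not identical to the router's.

```python
import difflib


def config_differences(startup, running):
    """Return lines found only in startup ('- ') or only in running ('+ ')."""
    diff = difflib.ndiff(startup.splitlines(), running.splitlines())
    # ndiff also emits unchanged ('  ') and hint ('? ') lines; drop those.
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Hypothetical saved copies of the two configs.
startup_cfg = """\
hostname r1
interface GigabitEthernet0/0
 ip address 10.0.0.1 255.255.255.0
"""
running_cfg = """\
hostname r1
interface GigabitEthernet0/0
 ip address 10.0.0.2 255.255.255.0
"""

for line in config_differences(startup_cfg, running_cfg):
    print(line)
```

Only the changed `ip address` line shows up, once with a `-` prefix for the startup version and once with a `+` prefix for the running version.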

Along with archiving the configuration, you can log the commands that were entered:

archive
 log config
  logging enable
  notify syslog
  hidekeys

With this in place, commands are logged to syslog with passwords hidden. Besides syslog, you can see which commands were entered using:

# sh archive log config all

Using the archive command you can easily roll back your configuration to a previous state:

# configure replace ftp:<path_to_archive_cfg> list force time 10

This means that your configuration is reverted to the archived config, and if you don’t confirm within 10 minutes, it will go back to the previous running config. To confirm the configuration:

# configure confirm

Moreover you can make your changes safer by doing

# configure terminal revert timer 1

This will roll back your configuration in 1 minute if you don’t confirm it. More about this feature in this great post:

http://packetpushers.net/cisco-configuration-archive-rollback-using-revert-instead-of-reload/



by admin at May 19, 2013 05:55 PM

Server Density

Chris Siebenmann

The technical effects of being an out of tree Linux kernel module

Suppose that you have a kernel module that is not in the mainstream kernel source for one reason or another. Perhaps it is license compatible but just not integrated for various reasons (as is the case with IET) or perhaps it is license incompatible (as is the case with ZFS on Linux). This non-inclusion has a number of cultural effects, but it also has real technical effects. Although I've mentioned them before, today I want to talk about them in some detail.

The first thing to know is that the Linux kernel does not have a stable kernel API for modules; how a module interacts with the rest of the kernel can and will change without notice. When your module is part of the kernel source, changing it to cope with the API change is generally the responsibility of the kernel developer who wants to make the API change. When your module is not in the kernel tree, not only is changing its code your job but so is even knowing about the API change. And API changes are not always obvious because sometimes they're things like changes in locking requirements or how you are supposed to use existing functions.

(Sometimes they are semi-obvious, like changing just what arguments a function takes. You do pay attention to all warning messages that show up when building your kernel module, right?)

Any number of people would like this to change but it isn't going to. The Linux kernel development process is optimized for in-tree code and not for out of tree code. If your out of tree code cannot be included in the kernel for various reasons, that's tough luck but the kernel developers really don't care that much (as a general rule). Locking themselves down to any stable module API would reduce their ability to improve and evolve the kernel code.

The next effect is pragmatic: if your code is not in the kernel tree, almost no one will look at it (and this includes automated scans over the kernel source code that look for various things) or do things to it. This is great if you're possessive about your code but it means that you're missing out on the quality checking that this creates, all of the little janitorial cleanups that people do, and if there is a bug then your module's developers are the only people who are looking at it.

(In some quarters it's fashionable to think that the Linux kernel developers are all clowns and cannot possibly contribute anything worthwhile to your code. This is a major mistake. Among other things they're basically certain to know the overall Linux kernel environment better than you do.)

A related issue is that the kernel developers try not to create bugs and regressions in in-tree code, especially if it's considered important (which, say, a commonly used filesystem will be); if one is created anyways a bunch of people will go looking to try to fix it. It's almost certain that no official kernel release would go out that broke a significant filesystem; the change that created the breakage would be identified and then reverted, with the change's developer told to try again. If your module is not in the tree, well, you're on your own. Performance regressions or actual breakages are your problem to diagnose and then either fix or try to argue the kernel developers into changing their side of the problem.

(And they may not, especially if your code is license-incompatible with the kernel and most especially if their change actually improves in-tree code and performance and so on.)

All of this means an out of tree kernel module requires more ongoing development work than an in-tree kernel module. In-tree kernel modules generally get somewhat of a ride from general kernel developers; out of tree modules do not and have to make up for it with time from their own developers. One predictable result is that many out of tree modules don't necessarily support all kernel versions, including kernel versions that sysadmins may want to use. A worst case situation with out of tree modules is that the developers simply stop updating the module for new kernels; any users of the module are then orphaned on old kernels.

by cks at May 19, 2013 05:20 AM

May 18, 2013

Ubuntu Geek

Libreoffice 4.0.3 released and PPA installation instructions included

LibreOffice is a comprehensive, professional-quality productivity suite that you can download and install for free. There is a large base of satisfied LibreOffice users worldwide, and it is available in more than 30 languages and for all major operating systems, including Microsoft Windows, Mac OS X and GNU/Linux (Debian, Ubuntu, Fedora, Mandriva, Suse, ...).
(...)
Read the rest of Libreoffice 4.0.3 released and PPA installation instructions included (367 words)


by ruchi at May 18, 2013 11:12 PM

Milek

/sys/admin/blog

Running IT like a business

Some older but really good articles on running IT like a business:

There are a few additions IT Managers should think hard about:

  • Businesses have customers, not users.  We have to be customer focused and look at everything from the customer’s view at service levels, not at each of our component levels.
  • We should treat the allocation of our human resources, where our staff time goes, just like we treat our financial budgets.
  • Our major goals should include improving business-IT communication and creating value for the business.  The more we integrate with the business the better the value we can add.
  • Business models are moving to cloud strategies.  We’re only going to get busier and need to respond quicker to business needs as our product and IT strategies evolve.   Every little bit we do to improve and standardize processes now will pay us back with dividends as our new world evolves.

We’re embarking on a major culture change in IT if we are going to keep pace with the changing business strategy.

by Joe at May 18, 2013 03:08 PM

Chris Siebenmann

A little habit of our documentation: how we write logins

Over the years, we've developed a number of local conventions for our local documentation. One of them is that we always write Unix logins with < and > around them, as if they were local email addresses, so that we'll talk about how <cks>'s processes had to be terminated or whatever. When I started here this struck me as vaguely goofy; over time it has rather grown on me and I now think it's a quite clever idea.

Writing logins this way does two things. The first is that they become completely unambiguous. This is not much of an issue with a login like 'cks', but we have any number of logins that are (or could be) people's first or last names, and vice versa. Consistently writing the login with <> around it removes that ambiguity and uncertainty. The second thing it does is that it makes it much easier to search for a particular login in old messages and documentation. Searching for 'chris' may get all sorts of hits that are not actually talking about the login chris; searching for '<chris>' narrows that down a lot.

(Well, sort of. The reality is that we sometimes wind up quoting various sorts of system messages and system logs in our messages and of course these messages generally don't use the '<login>' form. However, often excluding these messages from a later search is good enough because we're mostly interested in the record of active things we did to an account.)
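
As a small illustration of the searching benefit, here is a sketch in Python. The sample notes and the helper name `find_login_mentions` are invented for the example and are not taken from any real documentation.

```python
def find_login_mentions(text, login):
    """Return the lines that mention a login in the bracketed <login> form."""
    needle = f"<{login}>"
    return [line for line in text.splitlines() if needle in line]


# Hypothetical notes mixing a person's name, a quoted log line, and the login.
notes = """\
Chris asked about more disk space on Tuesday.
Terminated <chris>'s runaway processes.
sshd: session opened for user chris
Raised the disk quota for <chris>.
"""

# A plain search matches the person, the log line, and the login alike.
plain_hits = [line for line in notes.splitlines() if "chris" in line.lower()]

# The bracketed search matches only the lines written about the login.
login_hits = find_login_mentions(notes, "chris")

print(len(plain_hits), "plain hits,", len(login_hits), "bracketed hits")
```

The plain search matches all four lines, while the bracketed search matches only the two that record something done to the account. The quoted sshd line is skipped, which is the trade-off described above.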

There's a corollary to the convenience of <login>: right now we have no similar notation convention for Unix groups. We write less about Unix groups than about Unix logins (and groups generally have more distinct names), but it would still be nice to have some convention so we could do unambiguous searches and so on.

by cks at May 18, 2013 05:13 AM

May 17, 2013

Byron Miller

Devops – It’s about critical thinking & the evolutionary “WHY” of Silos.

I believe one of the best things to ever come out of the DevOps movement that no one seems to be describing is essentially an explosion of critical thinking and reasoning skills: the maturing of IT, if you will. Some people prescribe different views or distill it into different methods such as C.A.M.S (Culture, Automation, Measure, […]

by byronm at May 17, 2013 01:56 PM

Aaron Johnson

Chris Siebenmann

Why I'm not considering btrfs for our future fileservers just yet

In a comment on yesterday's entry I was asked:

Could you elaborate on the "btrfs does not qualify" part?

What's missing? How likely do you think this to change in the near future?

I will give a simple looking answer that conceals big depths: what's missing is a btrfs webpage that doesn't say 'run the latest kernel.org kernel' and a Fedora release that doesn't say 'btrfs is still experimental and is included as a technology preview' (which is what Fedora 18 says). It's possible that btrfs is more mature and ready than I think it is, but if so the btrfs people are doing a terrible job of publicizing this. Fundamentally I want to be using something that the developers consider 'mature' or at least 'ready' and I don't want us to be among the first pioneers with a production deployment of decent size in a challenging environment.

Pragmatically there is nothing that btrfs can do to make us consider it in the near future, for reasons I wrote about two years ago in an entry on the timing of production btrfs deployments. If btrfs magically became perfect tomorrow, it would only appear in an Ubuntu LTS release in 2014 and a Red Hat Enterprise release in, well, who knows but probably not this year.

(The current Ubuntu 12.04 LTS has btrfs v3.2, whereas btrfs is up to v3.9 already. The btrfs changelog shows the scope of a year's evolution.)

As far as what in specific is missing, well, I have to confess that I haven't looked at the current state of btrfs in much detail and so I don't have specific answers. I poke at btrfs vaguely every so often; generally I discover something that strikes me as alarming and then I go away again. Since btrfs is never going to be exactly like ZFS, I can't just directly translate our ZFS fileserver design to btrfs and then complain about what's missing or different. To have a really informed opinion on what btrfs needed and what was wrong with it, I'd have to do a btrfs-based fileserver design from scratch, trying to harmonize what we think we want (which has been shaped by what ZFS gives us) with what btrfs gives us. So far there seems to be no real point to doing that before btrfs stabilizes.

(I'm starting to think that btrfs and ZFS have fundamentally different visions about some things, but that needs some more reading and another entry.)

Sidebar: ZFS on Linux maturity versus btrfs maturity

You might ask why I'm willing to consider ZFS on Linux even though it's a relatively young project, just like btrfs. The answer is that the two are fundamentally different. The ZFS part of ZoL is generally a mature and well proven codebase; most of the uncertain new bits are just for fitting it into Linux.

by cks at May 17, 2013 05:30 AM

The Nubby Admin

Cascading Failure, Technical Debt, and Punching a House with my Face

At 11:32PM, Saturday May 11th, I got an email from MX Toolbox notifying me that a SBS 2008 machine that I support had gone unresponsive. It’s 600 miles away from me in another state. This was not a strange occurrence with this server.

A Cluster of Prior Failures

Five years ago a small office with a minimal budget needed a SBS implementation. I recommended an HP ML 115 G5 with four hard drives and onboard RAID provided by an NVIDIA chipset. I have regretted that decision for all five years. Here’s a post of mine concerning that chipset and the troubles I’ve had with it.

In short, I have poor insight into and control over the entire server’s health. Some examples include:

  • I couldn’t update the hard drives’ firmware, which was a big deal because the serial numbers of those hard drives fell into a set of drives that have a known problem with suddenly going offline. The firmware update has to be applied through HP’s support tools, which are not supported on the ML 110/115. After much research and seeking help from HP, I was told that, in essence, I was left out to dry.
  • The ML 110/115 does not support the ProLiant Support Pack nor does that model support the Insight Control Manager. Keeping drivers updated and staying abreast of the various components’ health was virtually impossible.
  • There was also no HP ILO CLI interface available which made doing things like firmware updates especially difficult remotely.
  • The on-board storage controller had poor support from Nvidia, and offered very slim storage management features or reporting on hard drive health.

For years I hit the management ceiling with that box, which probably cost my client more in my time and theirs than a more robust server at twice the hardware cost would have. And then what I had been dreading for years finally happened…

Two Months Ago

“Did you reboot the server?” That’s never a question you want to hear, especially when you did not reboot a server. I VPN’d into that office’s network and checked for the presence of the server on the network. Yes, the server was down. One power cycle later, the OS loaded just fine.

I checked the event logs and it turns out there was a massive flurry of parity errors that came out of nowhere. The server froze as a result. The controller was apparently dying. After a reboot, the data appeared fine, and there were no more parity errors coming from the Nvidia storage driver. I knew something had to be done, but being remote and working with an office that has a shoestring budget (and can often only afford used shoestrings) made the options few and unattractive.

What’s worse, as I started investigating things further, I noticed that the ILO Advanced card that was in the server was no longer showing on the network. Aaaaand the BIOS clock would reset to July 2009 after being shut down (BIOS battery dying) causing strange problems with Active Directory and other applications running on the network that relied on accurate time (read: everything). AAAaaaaaand the two mirror sets (one for the system volume and one for the email server’s databases) had split apart and could not be re-synced because the Nvidia storage management software no longer recognized that any hard drives were connected.

The options, as I saw them, were for the business to either buy a new RAID controller, BIOS battery, and perhaps ILO card (and then scramble to perform the complex surgery remotely on their own, or pay a local consultant to coordinate with me, or pay to ship me on site) or get a new server altogether (and pay a local consultant to coordinate with me, or… you get the idea). Either way, it started to look more and more like a total forklift migration was necessary.

Two Months Later

Yes, it’s been about two months and the server is still riding in the same perilous state. Split mirrors, bi-monthly freezes that require a power cycle to recover from, and a lot of hoping and praying that data is not corrupted. Welcome to the world of supporting small business IT where people re-use tea bags and don’t run heat or AC in order to save money and keep the business open.

That Saturday night, it was getting late and I was thinking about bed. I checked my email one last time for anything pressing when I saw an MX Toolbox alert. This is never good. I scanned the email, saw what host was causing the alert, and knew that I was dead in the water. I could get into that client’s network via both a SonicWall VPN and unattended TeamViewer installations that existed on most of the workstation PCs. However, it was all futile because I didn’t have hardware level access to the server as a result of the ILO’s failure. The office has a Lantronix Spider KVMoIP device that was being used to work on a workstation migration for one employee, and was therefore not hooked up to the main office server. That made two layers of out of band management that were not doing any good for the most important technology asset in the building.

All of this meant that someone would have to show up at the office to power cycle the server. The technical debt and compound interest of failure had already mounted fairly high by that point, considering the state of the server. However, things were about to get comical.

I’ll Gladly Pay You Tomorrow for Out of Band Management Today

What happened in the next 24 hours was a morbid comedy of oversights and compounded problems that ended in a whiplash inducing facepalm.

First, I needed to email three people who would most likely be in the vicinity of that office so I could coordinate with one of them to drop by on their Sunday morning and power cycle the server. Except the server is what does email for the organization so I can’t send to their organization email addresses (this is a Microsoft SBS machine). I only know of one employee’s non-work address, and I also happen to know the gmail address of another employee’s son.

I email those two people and tell them of the situation. As it turns out, two key workers are out, traveling to a convention in Texas. That makes access to email even more vital than normal. Everyone knows the situation and there’s not much more I can do so I get to bed. It’s not until about 2PM on Sunday, Mother’s Day here in the USA, that I hear back from one worker who has just enough time to skip by the office and power cycle the server.

Myself, I’m in the midst of a Mother’s Day dinner with my own family so I had ditched my phone… just moments before the employee called me from the remote office. I missed the call and the employee left a voicemail expressing a state of confusion over which server to power cycle. The organization is small and only has two servers. One is the SBS machine and the other is a HP MicroServer that is used as a network monitoring station and catchall for various extraneous services. I had assumed that over the years everyone had each server’s role understood by sight so I simply asked him to power cycle the SBS server, expecting that it would be known which piece of hardware that was. The fellow power cycled both servers since he couldn’t get in touch with me directly.

Okay, no big deal. The MicroServer is just running CentOS and OpenNMS. They’re resilient and can handle a sudden shutdown. As I listened to that voicemail, I checked to see if I could remotely connect to the server that had been down all night. I couldn’t. Great. Time to call the office and talk to the person who was on site and see what else could be done. Except the voicemail had been left over an hour ago and the employee had naturally left shortly after power cycling the server. I called his cell phone back, but he didn’t pick up. I left a voicemail.

A little later that Sunday I get in touch with another employee in the area who lives closer. He’s on his way out to pick up Mother’s Day dinner for his wife and can swing by to check out the server. First, I have him power cycle it again. Maybe the first guy just clicked the power button and didn’t hold it in? I held out hope for such a simple explanation. However, after I instructed this second person on how to make sure the server had shut down and then powered up, I waited for the duration of the standard bootup but nothing was showing up. It became apparent that the server was not coming back online.

“Do you know where the Spider is?” I asked hopefully. “No, I dunno where the other guy put it.” Gah! The Spider is a well known piece of equipment in that office, and it’s very rare that it can’t be found. I was about to concede defeat for that Sunday when, after some searching, the employee found the Spider. A few minutes of scrambling around and he had the thing hooked up to the server. Except… now I couldn’t get to the Spider. The fellow had to leave to pick up dinner and I wasn’t about to ruin his family’s Mother’s Day so I told him I’d see what I could do remotely, expecting nothing to be successful.

In the process of hooking up the Lantronix Spider, the employee had pulled the network cable out of the server and put it into the Spider. Then from the Spider’s cascade port (it’s essentially a one port switch) he had connected a patch cable to the server’s LAN port. That made me wonder… perhaps it was a port on the ProCurve switch that was bad? That would explain both the server and now the Lantronix Spider being inaccessible. Or maybe the port spontaneously shut down as a result of some bug. Crazier things have happened.

I browsed to the switch’s management interface. “Please enter your username and password!” Okay, no problem! “Wait… I can’t remember what the password is… NOOOOOO!” The organization uses KeePass to store important passwords and software keys. The KeePass file is on the server. The server that is down.

But wait! I have a copy of the KeePass databases on my own storage. Once a month or so I copy the files to my local storage so that I have an in-sync copy just in case. Whew! I find the switch’s login credentials and begin inspecting things. I looked, hoping for some bad news concerning the switch’s health (at least that would mean the server was okay), but the switch looked perfect. Nothing was amiss.

I’ve always been told to troubleshoot network problems from the lowest layer first. I had pretty much ruled out the physical layer. Layer 2 seemed healthy. Not much that can go wrong on a small, single subnet LAN. Layer 3, IP… IP addresses… I gritted my teeth. I knew what the problem was. The Lantronix Spider is set to pick up an address via DHCP. Specifically it’s a DHCP reservation on the network’s DHCP server. The server that’s down. I wanted the network layer benefits of a static IP address, but I also wanted the Spider to be easily portable between networks. My original idea was that the Spider could be used to support PCs on other LANs, like perhaps workers that were based in their home office that didn’t come into the organization’s building very often. With the Spider getting an IP address via DHCP, I could just tell someone to take it home with them and I’d only be left with walking them through configuring port forwarding, or getting TeamViewer set up on a PC on their LAN so I could get in and access the Spider via a local web browser. Except now the Spider was barking out forlorn DHCP discover packets and not getting any response back.

I fired up Network Monitor on an office PC to be sure. Yep, there it was. A DHCP discover request broadcasting every sixty seconds or so. Okay, I can handle this. The small office has a SonicWall firewall that has DHCP services on it. I only need to enable them, check its list of leases to find what IP address the Spider was given, and I’ll be good! I mosey my web browser on over to the firewall’s administrative page. I stare at it. It wants the password for the admin user. “Password… password… I had to change it a few weeks ago. What did I choose…”

Oh well, I’ll look in the organization’s copied password file that I keep on my local storage! Yay foresight!! I found the firewall admin password and entered it. “Password Failure. Please Retry.” What?! Then I remembered that I had changed the firewall password due to security policy about two weeks ago. However, I hadn’t copied the organization’s password file to my local storage in a month. I had the old password in my copy of the password file, but not the new one. The new one was on the server that was currently down. Backups are taken every few hours, but a restoration needs to be done on functioning hardware. Super.
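The gap here was a local copy roughly a month old against a password that had changed two weeks earlier. A minimal sketch of a staleness check for such a copy follows, assuming GNU stat is available. The function names and the 14-day limit are hypothetical and do not come from KeePass or any other tool mentioned in this post.

```shell
#!/bin/sh
# Sketch: warn when a local copy of a password database is older than a limit.
# Function names and the limit passed in are illustrative assumptions.

copy_age_days() {
    # Print whole days since the file at $1 was last modified (GNU stat).
    now=$(date +%s)
    mtime=$(stat -c %Y "$1" 2>/dev/null) || return 1
    echo $(( (now - mtime) / 86400 ))
}

check_copy() {
    # $1 = path to the local copy, $2 = maximum acceptable age in days
    age=$(copy_age_days "$1") || { echo "MISSING: $1"; return 2; }
    if [ "$age" -gt "$2" ]; then
        echo "STALE: $1 is $age days old (limit $2)"
        return 1
    fi
    echo "OK: $1 is $age days old"
}
```

Run from a daily cron job with a hypothetical path, for example check_copy "$HOME/backup/passwords.kdbx" 14, something like this would have flagged the month-old copy before it was needed.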

So that means I did it again. I couldn’t log in to the interface because I didn’t have the long password committed to memory. For super important passwords like that, I do keep a disaster recovery hard copy around. It’s essentially a few pages spelling out the most important usernames and passwords for the organization. However, only two people have that physical copy of information. While I could call them up and have them read off the password to me, I wasn’t ready to do that.

Instead, I turned to the HP MicroServer running CentOS 6. I have OpenNMS installed on it and have plans to install some ticketing software and maybe smokeping or M/Monit. Now, however, it’s going to be an impromptu DHCP server. Fortunately I can remember the password for the MicroServer! A quick ‘yum install dhcp’ later and… “Couldn’t resolve host ‘centos-distro.cavecreek.net’” WHAT DEVILRY IS THIS?! But of course; DNS for the network is performed by the SBS server… which is down. After facepalming, I changed resolv.conf to point to OpenDNS and continued my march towards a functioning DHCP server on the network. After a few minutes I have dhcpd running and it quickly hands out a lease to the Spider.
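As a rough sketch of that stopgap, the helper below prints a minimal dhcpd.conf for a single subnet. The subnet, pool range, and gateway shown are hypothetical placeholders, not the office’s real addressing, and render_dhcpd_conf is an illustrative name. The two resolver addresses are the public OpenDNS servers.

```shell
#!/bin/sh
# Sketch: print a minimal dhcpd.conf for a stopgap DHCP server.
# All subnet, pool and gateway values used below are hypothetical.

render_dhcpd_conf() {
    # $1 subnet, $2 netmask, $3 first pool address, $4 last pool address,
    # $5 default gateway, $6 comma-separated DNS servers
    cat <<EOF
subnet $1 netmask $2 {
  range $3 $4;
  option routers $5;
  option domain-name-servers $6;
  default-lease-time 600;
  max-lease-time 3600;
}
EOF
}

# OpenDNS public resolvers, since the usual internal DNS server is down.
OPENDNS="208.67.222.222, 208.67.220.220"

# Print the config to stdout; nothing on the system is modified.
render_dhcpd_conf 192.168.1.0 255.255.255.0 192.168.1.200 192.168.1.220 192.168.1.1 "$OPENDNS"
```

On CentOS 6 the output would typically be saved as /etc/dhcp/dhcpd.conf and the daemon started with service dhcpd start, after which handed-out leases show up in /var/lib/dhcpd/dhcpd.leases. The short lease times are a deliberate choice for a temporary server, so clients return to the regular DHCP server soon after it is back.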

And it was then that I saw it. After logging into the Spider, I viewed the remote console and saw a Windows installation screen on the server. Suddenly, I remembered what happened. In the process of preparing for a migration away from the failing hardware, I needed to experiment with making an unattended installation file. I had a remote worker put the SBS 2008 install CD in the main server’s tray. Of course, rebooting caused the server to boot into the high boot priority CD drive. I sat in horror, thinking about my cascade of failures. Nevertheless, that wasn’t the time to flail in self-loathing. I simply needed to hit “cancel” and get out of the installation welcome screen to boot from the hard drive.

Except the Spider was unable to interact with the server as a remote keyboard or mouse. I’ve used the Spider on that very server in the past, and it worked great at all stages of the boot process. In the years that I’ve worked with that office I’ve had to check BIOS settings, iLO firmware settings, and storage controller settings, all using either the Spider or the iLO itself. But now, for some unexplained reason, the Spider was not able to input anything. I couldn’t move the mouse, I couldn’t press keys. So I sat and stared at the remote video in complete disbelief.

It was a simple matter of leaving a voicemail for someone and telling them to remove the disc from the DVD drive the next time they were in the office. The next morning the worker that I left a message for did just that, power cycled the server, and it booted up as normal. Life continued.

I was abashed.

More about my conclusions concerning the situation later. In the meantime, got a similar story to share? Let me know in the comments below or contact me and you can write a guest blog post about it.

by Wesley David at May 17, 2013 03:38 AM

May 16, 2013

Ubuntu Geek

How to Install Cinnamon 1.8 on ubuntu 13.04

Cinnamon is a user interface. It is a fork of GNOME Shell, initially developed by (and for) Linux Mint. It attempts to provide a more traditional user environment based on the desktop metaphor, like GNOME 2. Cinnamon uses Muffin, a fork of the GNOME 3 window manager Mutter, as its window manager from Cinnamon 1.2 onwards
(...)
Read the rest of How to Install Cinnamon 1.8 on ubuntu 13.04 (249 words)


© ruchi for Ubuntu Geek, 2013.

by ruchi at May 16, 2013 11:54 PM


Administered by Joe. Content copyright by their respective authors.