Linux Journal

More L337 Translations

10 hours 44 minutes ago
Dave Taylor | Thu, 04/19/2018 - 09:20 | HOW-TOs, Programming, Shell Scripting

Dave continues with his shell-script L33t translator.

In my last article, I talked about the inside jargon of hackers and computer geeks known as "Leet Speak" or just "Leet". Of course, that's a shortened version of the word Elite, and it's best written as L33T or perhaps L337 to be ultimately kewl. But hey, I don't judge.

Last time I looked at a series of simple letter substitutions that allow you to convert a sentence like "I am a master hacker with great skills" into something like this:

I AM A M@ST3R H@XR WITH GR3@T SKILLZ

It turns out that I missed some nuances of Leet and didn't realize that most often the letter "a" is actually turned into a "4", not an "@", although as with just about everything about the jargon, it's somewhat random.

In fact, every single letter of the alphabet can be randomly tweaked and changed, sometimes from a single letter to a sequence of two or three symbols. For example, another variation on "a" is "/-\" (for what are hopefully visually obvious reasons).

Continuing in that vein, "B" can become "|3", "C" can become "[", "I" can become "1", and one of my favorites, "M" can change into "[]V[]". That's a lot of work, but since one of the goals is to have a language no one else understands, I get it.
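
If you want to experiment with the multi-character substitutions, they drop into the same per-word sed approach used throughout the script. Here's a minimal sketch (lowercase patterns, to match the style of the other substitutions below, and an alternate delimiter on the second line so the slash in "/-\" doesn't collide with sed's own separator):

word="$(echo $word | sed 's/b/|3/g; s/c/[/g; s/i/1/g; s/m/[]V[]/g')"
word="$(echo $word | sed 's|a|/-\\|g')"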

There are additional substitutions: a word can have its trailing "S" replaced by a "Z", a trailing "ED" can become "'D" or just "D", and another interesting one is that words containing "and", "anned" or "ant" can have that sequence replaced by an ampersand (&).
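
The trailing-"S" rule, for example, is a single sed substitution anchored to the end of the word. Here's a minimal sketch, wrapped in the same style of random test used later so it doesn't fire on every word:

if [ $(( $RANDOM % 10 )) -ge 5 ] ; then
  word="$(echo $word | sed 's/s$/z/')"
fi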

Let's add all these L337 filters and see how the script is shaping up.

But First, Some Randomness

Since many of these transformations are going to have a random element, let's go ahead and produce a random number between 1–10 to figure out whether to do one or another action. That's easily done with the $RANDOM variable:

doit=$(( $RANDOM % 10 )) # random virtual coin flip

Now let's say that there's a 50% chance that a -ed suffix is going to change to "'D" and a 50% chance that it's just going to become "D", which is coded like this:

if [ $doit -ge 5 ] ; then
  word="$(echo $word | sed "s/ed$/d/")"
else
  word="$(echo $word | sed "s/ed$/'d/")"
fi

Let's add the additional transformations, but not do them every time. Let's give them a 70–90% chance of occurring, based on the transform itself. Here are a few examples:

if [ $doit -ge 3 ] ; then
  word="$(echo $word | sed "s/cks/x/g;s/cke/x/g")"
fi
if [ $doit -ge 4 ] ; then
  word="$(echo $word | sed "s/and/\&/g;s/anned/\&/g;s/ant/\&/g")"
fi

And so, here's the second translation, a bit more sophisticated:

$ l33t.sh "banned? whatever. elite hacker, not scriptie."
B&? WH4T3V3R. 3LIT3 H4XR, N0T SCRIPTI3.

Note that it hasn't realized that "elite" should become L337 or L33T, but since it is supposed to be rather random, let's just leave this script as is. Kk? Kewl.

If you want to expand it, an interesting programming problem is to break each word down into individual letters, then randomly change lowercase to uppercase or vice versa, so you get those great ransom-note-style WeiRD LeTtEr pHrASes.
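
That last idea fits naturally into the same per-word loop. Here's a minimal sketch (the function name is just for illustration, not part of Dave's script) that walks a word one character at a time and randomly flips each letter's case:

ransomize() {
    local word="$1" result="" i c
    for (( i=0; i<${#word}; i++ )); do
        c="${word:$i:1}"
        if [ $(( RANDOM % 2 )) -eq 0 ] ; then
            result="$result$(echo "$c" | tr '[:lower:]' '[:upper:]')"
        else
            result="$result$(echo "$c" | tr '[:upper:]' '[:lower:]')"
        fi
    done
    echo "$result"
}

ransomize "weird letter phrases"    # e.g. WeiRD LeTtEr pHrASes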

Next time, I plan to move on, however, and look at the great command-line tool youtube-dl, exploring how to use it to download videos and even just the audio tracks as MP3 files.

Dave Taylor

Help Canonical Test GNOME Patches, Android Apps Illegally Tracking Kids, MySQL 8.0 Released and More

10 hours 58 minutes ago
GNOME, Desktop, News, Android, Security, Privacy, MySQL, KDE, LibreOffice, Cloud

News briefs for April 19, 2018.

Help Canonical test the GNOME desktop memory leak fixes in Ubuntu 18.04 LTS (Bionic Beaver) by downloading and installing the current daily ISO for your hardware from here: http://cdimage.ubuntu.com/daily-live/current/bionic-desktop-amd64.iso. Then download the patched version of gjs, install, reboot, and then just use your desktop normally. If performance seems impacted by the new packages, re-install from the ISO again, but don't install the new packages and see if things are better. See the Ubuntu Community page for more detailed instructions.

Thousands of Android apps downloaded from the Google Play store may be tracking kids' data illegally, according to a new study. NBC News reports: "Researchers at the University of California's International Computer Science Institute analyzed 5,855 of the most downloaded kids apps, concluding that most of them are 'potentially in violation' of the Children's Online Privacy Protection Act of 1998, or COPPA, a federal law making it illegal to collect personally identifiable data on children under 13."

MySQL 8.0 has been released. This new version "includes significant performance, security and developer productivity improvements enabling the next generation of web, mobile, embedded and Cloud applications." MySQL 8.0 features include MySQL document store, transactional data dictionary, SQL roles, default to utf8mb4 and more. See the white paper for all the details.

KDE announced this morning that KDE Applications 18.04.0 are now available. New features include improvements to panels in the Dolphin file manager; Wayland support for KDE's JuK music player; improvements to Gwenview, KDE's image viewer and organizer; and more.

Collabora Productivity, "the driving force behind putting LibreOffice in the cloud", announced a new release of its enterprise-ready cloud document suite—Collabora Online 3.2. The new release includes implemented chart creation, data validation in Calc, context menu spell-checking and more.

Jill Franklin

An Update on Linux Journal

1 day 7 hours ago
Carlie Fairchild | Wed, 04/18/2018 - 12:41 | Linux Journal, Subscribe, Patron, Advertise, Write

So many of you have asked how to help Linux Journal continue to be published* for years to come.

First, keep the great ideas coming—we all want to continue making Linux Journal 2.0 something special, and we need this community to do it.

Second, subscribe or renew. Magazines have a built-in fundraising program: subscriptions. It's true that most magazines don't survive on subscription revenue alone, but having a strong subscriber base tells Linux Journal, prospective authors, and yes, advertisers, that there is a community of people who support and read the magazine each month.

Third, if you prefer reading articles on our website, consider becoming a Patron. We have different Patreon reward levels; one even gets your name immortalized in the pages of Linux Journal.

Fourth, spread the word within your company about corporate sponsorship of Linux Journal. We as a community reject tracking, but we explicitly invite high-value advertising that sponsors the magazine and values readers. This is new and unique in online publishing, and just one example of our pioneering work here at Linux Journal.  

Finally, write for us! We are always looking for new writers, especially now that we are publishing more articles more often.  

With all our gratitude,

Your friends at Linux Journal

 

*We'd be remiss not to acknowledge and thank Private Internet Access for saving the day and bringing Linux Journal back from the dead. They are incredibly supportive partners and, sincerely, we cannot thank them enough for keeping us going. At a certain point, however, Linux Journal has to become sustainable on its own.

Carlie Fairchild

Rise of the Tomb Raider Comes to Linux Tomorrow, IoT Developers Survey, New Zulip Release and More

1 day 9 hours ago
News, gaming, Chrome, GIMP, IoT, openSUSE, Distributions, Desktop

News briefs for April 18, 2018.

Rise of the Tomb Raider: 20 Year Celebration comes to Linux tomorrow! A minisite dedicated to Rise of the Tomb Raider is available now from Feral Interactive, and you also can view the trailer on Feral's YouTube channel.

Zulip, the open-source team chat software, has announced the release of Zulip Server 1.8. This is a huge release, with more than 3500 new commits since the last release in October 2017. Zulip "is an alternative to Slack, HipChat, and IRC. Zulip combines the immediacy of chat with the asynchronous efficiency of email-style threading, and is 100% free and open-source software".

The IoT Developers Survey 2018 is now available. The survey was sponsored by the Eclipse IoT Working Group, Agile IoT, IEEE and the Open Mobile Alliance "to better understand how developers are building IoT solutions". The survey covers what people are building, key IoT concerns, top IoT programming languages and distros, and more.

Google released Chrome 66 to its stable channel for desktop/mobile users. This release includes many security improvements as well as new JavaScript APIs. See the Chrome Platform Status site for details.

openSUSE Leap 15 is scheduled for release May 25, 2018. Leap 15 "shares a common core with SUSE Linux Enterprise (SLE) 15 sources and has thousands of community packages on top to meet the needs of professional and semi-professional users and their workloads."

GIMP 2.10.0 RC 2 has been released. This release fixes 44 bugs and introduces important performance improvements. See the complete list of changes here.

Jill Franklin

Create Dynamic Wallpaper with a Bash Script

1 day 10 hours ago
Patrick Wheelan | Wed, 04/18/2018 - 09:58 | bash, Desktop, Programming

Harness the power of bash and learn how to scrape websites for exciting new images every morning.

So, you want a cool dynamic desktop wallpaper without dodgy programs and a million viruses? The good news is, this is Linux, and anything is possible. I started this project because I was bored of my standard OS desktop wallpaper, and I have slowly created a plethora of scripts to pull images from several sites and set them as my desktop background. It's a nice little addition to my day—being greeted by a different cat picture or a panorama of a country I didn't know existed. The great news is that it's easy to do, so let's get started.

Why Bash?

Bash (the Bourne Again SHell) is standard across almost all *NIX systems and provides a wide range of operations "out of the box", which would take time and copious lines of code to achieve in a conventional coding or even scripting language. Additionally, there's no need to re-invent the wheel. It's much easier to use somebody else's program to download web pages, for example, than to deal with low-level system sockets in C.

How's It Going to Work?

The concept is simple. Choose a site with images you like and "scrape" the page for those images. Then, once you have a direct link, you download the image and set it as the desktop wallpaper using the display manager. Easy, right?

A Simple Example: xkcd

To start off, let's venture to every programmer's second-favorite page after Stack Overflow: xkcd. Loading the page, you should be greeted by the daily comic strip and some other data.

Now, what if you want to see this comic without venturing to the xkcd site? You need a script to do it for you. First, you need to know how the webpage looks to the computer, so download it and take a look. To do this, use wget, an easy-to-use, commonly installed, non-interactive, network downloader. So, on the command line, call wget, and give it the link to the page:

user@LJ $: wget https://www.xkcd.com/
--2018-01-27 21:01:39--  https://www.xkcd.com/
Resolving www.xkcd.com... 151.101.0.67, 151.101.192.67, 151.101.64.67, ...
Connecting to www.xkcd.com|151.101.0.67|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2606 (2.5K) [text/html]
Saving to: 'index.html'

index.html      100%[==========================================================>]   2.54K  --.-KB/s    in 0s

2018-01-27 21:01:39 (23.1 MB/s) - 'index.html' saved [6237]

As you can see in the output, the page has been saved to index.html in your current directory. Using your favourite editor, open it and take a look (I'm using nano for this example):

user@LJ $: nano index.html

Now you might realize, despite this being a rather bare page, there's a lot of code in that file. Instead of going through it all, let's use grep, which is perfect for this task. Its sole function is to print lines matching your search. Grep uses the syntax:

user@LJ $: grep [search] [file]

Looking at the daily comic, its current title is "Night Sky". Searching for "night" with grep yields the following results:

user@LJ $: grep "night" index.html Image URL (for hotlinking/embedding): ↪https://imgs.xkcd.com/comics/night_sky.png

The grep search has returned two image links in the file, each related to "night". Looking at those two lines, one is the image in the page, and the other is for hotlinking and is already a usable link. You'll be obtaining the first link, however, as it is more representative of other pages that don't provide an easy link, and it serves as a good introduction to the use of grep and cut.

To get the first link out of the page, you first need to identify it in the file programmatically. Let's try grep again, but this time instead of using a string you already know ("night"), let's approach it as if you know nothing about the page. Although the link will be different, the HTML around it should remain the same; therefore, "img src=" always should appear before the link you want:

user@LJ $: grep "img src=" index.html

It looks like there are three images on the page. Comparing these results with those from the first grep, you'll see that only one of them is the comic you're after. The other two links contain "/s/", whereas the link you want contains "/comics/". So, you need to grep the output of the last command for "/comics/". To pass along the output of the last command, use the pipe character (|):

user@LJ $: grep "img src=" index.html | grep "/comics/"

And, there's the line! Now you just need to separate the image link from the rest of it with the cut command. cut uses the syntax:

user@LJ $: cut [-d delimiter] [-f field] [-c characters]

To cut the link from the rest of the line, you'll want to cut next to the quotation mark and select the field before the next quotation mark. In other words, you want the text between the quotes, or the link, which is done like this:

user@LJ $: grep "img src=" index.html | grep "/comics/" | ↪cut -d\" -f2 //imgs.xkcd.com/comics/night_sky.png

And, you've got the link. But wait! What about those pesky forward slashes at the beginning? You can cut those out too:

user@LJ $: grep "img src=" index.html | grep "/comics/" | ↪cut -d\" -f 2 | cut -c 3- imgs.xkcd.com/comics/night_sky.png

Now you've just cut the first three characters from the line, and you're left with a link straight to the image. Using wget again, you can download the image:

user@LJ $: wget imgs.xkcd.com/comics/night_sky.png
--2018-01-27 21:42:33--  http://imgs.xkcd.com/comics/night_sky.png
Resolving imgs.xkcd.com... 151.101.16.67, 2a04:4e42:4::67
Connecting to imgs.xkcd.com|151.101.16.67|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 54636 (53K) [image/png]
Saving to: 'night_sky.png'

night_sky.png   100%[===========================================================>]  53.36K  --.-KB/s    in 0.04s

2018-01-27 21:42:33 (1.24 MB/s) - 'night_sky.png' saved [54636/54636]

Now you have the image in your directory, but its name will change when the comic's name changes. To fix that, tell wget to save it with a specific name:

user@LJ $: wget "$(grep "img src=" index.html | grep "/comics/" ↪| cut -d\" -f2 | cut -c 3-)" -O wallpaper --2018-01-27 21:45:08-- http://imgs.xkcd.com/comics/night_sky.png Resolving imgs.xkcd.com... 151.101.16.67, 2a04:4e42:4::67 Connecting to imgs.xkcd.com|151.101.16.67|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 54636 (53K) [image/png] Saving to: 'wallpaper' wallpaper 100% [==========================================================>] 53.36K --.-KB/s in 0.04s 2018-01-27 21:45:08 (1.41 MB/s) - 'wallpaper' saved [54636/54636]

The -O option means that the downloaded image now has been saved as "wallpaper". Now that you know the name of the image, you can set it as a wallpaper. This varies depending upon which display manager you're using. The most popular are listed below, assuming the image is located at /home/user/wallpaper.

GNOME:

gsettings set org.gnome.desktop.background picture-uri "file:///home/user/wallpaper"
gsettings set org.gnome.desktop.background picture-options scaled

Cinnamon:

gsettings set org.cinnamon.desktop.background picture-uri "file:///home/user/wallpaper"
gsettings set org.cinnamon.desktop.background picture-options scaled

Xfce:

xfconf-query --channel xfce4-desktop --property /backdrop/screen0/monitor0/image-path --set /home/user/wallpaper

You can set your wallpaper now, but you need different images to mix in. Looking at the webpage, there's a "random" button that takes you to a random comic. Searching with grep for "random" returns the following:

user@LJ $: grep random index.html
Both matches are "Random" links, and each points to the same place: c.xkcd.com/random/comic/. This is the link to a random comic, and downloading it with wget and reading the result, it looks like the initial comic page. Success!

    Now that you've got all the components, let's put them together into a script, replacing www.xkcd.com with the new c.xkcd.com/random/comic/:

#!/bin/bash

wget c.xkcd.com/random/comic/

wget "$(grep "img src=" index.html | grep /comics/ | cut -d\" -f 2 | cut -c 3-)" -O wallpaper

gsettings set org.gnome.desktop.background picture-uri "file:///home/user/wallpaper"
gsettings set org.gnome.desktop.background picture-options scaled

All of this should be familiar except the first line, which designates this as a bash script, and the second wget command. To capture the output of commands into a variable, you use $(). In this case, you're capturing the grepping and cutting process—capturing the final link and then downloading it with wget. When the script is run, the commands inside the parentheses all run first, producing the image link, and then wget is called to download it.
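
If command substitution is new to you, here's the same idea as a standalone sketch; the variable name is just for illustration:

link="$(grep "img src=" index.html | grep "/comics/" | cut -d\" -f2 | cut -c 3-)"
echo "Downloading $link"
wget "$link" -O wallpaper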

    There you have it—a simple example of a dynamic wallpaper that you can run anytime you want.

If you want the script to run automatically, you can add a cron job so that cron runs it for you. So, edit your crontab with:

    user@LJ $: crontab -e

    My script is called "xkcd", and my crontab entry looks like this:

    @reboot /bin/bash /home/user/xkcd

This will run the script (located at /home/user/xkcd) using bash at every restart.

    Reddit

    The script above shows how to search for images in HTML code and download them. But, you can apply this to any website of your choice—although the HTML code will be different, the underlying concepts remain the same. With that in mind, let's tackle downloading images from Reddit. Why Reddit? Reddit is possibly the largest blog on the internet and the third-most-popular site in the US. It aggregates content from many different communities together onto one site. It does this through use of "subreddits", communities that join together to form Reddit. For the purposes of this article, let's focus on subreddits (or "subs" for short) that primarily deal with images. However, any subreddit, as long as it allows images, can be used in this script.

    Figure 1. Scraping the Web Made Simple—Analysing Web Pages in a Terminal

    Diving In

    Just like the xkcd script, you need to download the web page from a subreddit to analyse it. I'm using reddit.com/r/wallpapers for this example. First, check for images in the HTML:

user@LJ $: wget https://www.reddit.com/r/wallpapers/ && grep "img src=" index.html
--2018-01-28 20:13:39--  https://www.reddit.com/r/wallpapers/
Resolving www.reddit.com... 151.101.17.140
Connecting to www.reddit.com|151.101.17.140|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 27324 (27K) [text/html]
Saving to: 'index.html'

index.html      100%[==========================================================>]  26.68K  --.-KB/s    in 0.1s

2018-01-28 20:13:40 (270 KB/s) - 'index.html' saved [169355]

a community for 9 years  ....Forever and ever......
--- SNIP ---

    All the images have been returned in one long line, because the HTML for the images is also in one long line. You need to split this one long line into the separate image links. Enter Regex.

Regex is short for regular expression, a system used by many programs to allow users to match an expression to a string. It contains wild cards, special characters that stand in for other characters. For example, the dot (.) matches any single character, and the * quantifier matches zero or more of the preceding token. For this example, you want an expression that matches every link in the HTML file. All HTML links have one thing in common: they take the form href="LINK". Let's write a regex to match it:

    href="([^"#]+)"

    Now let's break it down:

    • href=" — simply states that the first characters should match these.

    • () — forms a capture group.

    • [^] — forms a negated set. The string shouldn't match any of the characters inside.

    • + — the string should match one or more of the preceding tokens.

    Altogether the regex matches a string that begins href=", doesn't contain any quotation marks or hashtags and finishes with a quotation mark.

    This regex can be used with grep like this:

user@LJ $: grep -o -E 'href="([^"#]+)"' index.html
href="/static/opensearch.xml"
href="https://www.reddit.com/r/wallpapers/"
href="//out.reddit.com"
href="//out.reddit.com"
href="//www.redditstatic.com/desktop2x/img/favicon/apple-icon-57x57.png"
--- SNIP ---

The -E option enables extended regex, and the -o switch means grep prints only the part of each line that matches the pattern, not the whole line. You now have a much more manageable list of links. From there, you can use the same techniques from the first script to extract the links and filter for images. This looks like the following:

user@LJ $: grep -o -E 'href="([^"#]+)"' index.html | cut -d'"' -f2 | sort | uniq | grep -E '.jpg|.png'
https://i.imgur.com/6DO2uqT.png
https://i.imgur.com/Ualn765.png
https://i.imgur.com/UO5ck0M.jpg
https://i.redd.it/s8ngtz6xtnc01.jpg
//www.redditstatic.com/desktop2x/img/favicon/android-icon-192x192.png
//www.redditstatic.com/desktop2x/img/favicon/apple-icon-114x114.png
//www.redditstatic.com/desktop2x/img/favicon/apple-icon-120x120.png
//www.redditstatic.com/desktop2x/img/favicon/apple-icon-144x144.png
//www.redditstatic.com/desktop2x/img/favicon/apple-icon-152x152.png
//www.redditstatic.com/desktop2x/img/favicon/apple-icon-180x180.png
//www.redditstatic.com/desktop2x/img/favicon/apple-icon-57x57.png
//www.redditstatic.com/desktop2x/img/favicon/apple-icon-60x60.png
//www.redditstatic.com/desktop2x/img/favicon/apple-icon-72x72.png
//www.redditstatic.com/desktop2x/img/favicon/apple-icon-76x76.png
//www.redditstatic.com/desktop2x/img/favicon/favicon-16x16.png
//www.redditstatic.com/desktop2x/img/favicon/favicon-32x32.png
//www.redditstatic.com/desktop2x/img/favicon/favicon-96x96.png

    The final grep uses regex again to match .jpg or .png. The | character acts as a boolean OR operator.

    As you can see, there are four matches for actual images: two .jpgs and two .pngs. The others are Reddit default images, like the logo. Once you remove those images, you'll have a final list of images to set as a wallpaper. The easiest way to remove these images from the list is with sed:

user@LJ $: grep -o -E 'href="([^"#]+)"' index.html | cut -d'"' -f2 | sort | uniq | grep -E '.jpg|.png' | sed /redditstatic/d
https://i.imgur.com/6DO2uqT.png
https://i.imgur.com/Ualn765.png
https://i.imgur.com/UO5ck0M.jpg
https://i.redd.it/s8ngtz6xtnc01.jpg

    sed works by matching what's between the two forward slashes. The d on the end tells sed to delete the lines that match the pattern, leaving the image links.

    The great thing about sourcing images from Reddit is that every subreddit contains nearly identical HTML; therefore, this small script will work on any subreddit.

    Creating a Script

    To create a script for Reddit, it should be possible to choose from which subreddits you'd like to source images. I've created a directory for my script and placed a file called "links" in the directory with it. This file contains the subreddit links in the following format:

https://www.reddit.com/r/wallpapers
https://www.reddit.com/r/wallpaper
https://www.reddit.com/r/NationalPark
https://www.reddit.com/r/tiltshift
https://www.reddit.com/r/pic

    At run time, I have the script read the list and download these subreddits before stripping images from them.

    Since you can have only one image at a time as desktop wallpaper, you'll want to narrow down the selection of images to just one. First, however, it's best to have a wide range of images without using a lot of bandwidth. So you'll want to download the web pages for multiple subreddits and strip the image links but not download the images themselves. Then you'll use a random selector to select one image link and download that one to use as a wallpaper.

Finally, if you're downloading lots of subreddits' web pages, the script will become very slow, because it waits for each command to complete before proceeding. To circumvent this, you can fork a command by appending an ampersand (&) character. This creates a new process for the command, "forking" it from the main process (the script), as shown in the quick sketch below.
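
This sketch is separate from the wallpaper script itself, and the URLs are just placeholders; the point is that the three downloads run concurrently and wait blocks until they're all done:

wget https://example.com/page1 -O page1 &    # each & sends the download to the background
wget https://example.com/page2 -O page2 &
wget https://example.com/page3 -O page3 &
wait                                         # block until every background job has finished
echo "all downloads complete"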

    Here's my fully annotated script:

#!/bin/bash

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" # Get the script's current directory
linksFile="links"

mkdir $DIR/downloads
cd $DIR/downloads

# Strip the image links from the html
function parse {
    grep -o -E 'href="([^"#]+)"' $1 | cut -d'"' -f2 | sort | uniq | grep -E '.jpg|.png' >> temp
    grep -o -E 'href="([^"#]+)"' $2 | cut -d'"' -f2 | sort | uniq | grep -E '.jpg|.png' >> temp
    grep -o -E 'href="([^"#]+)"' $3 | cut -d'"' -f2 | sort | uniq | grep -E '.jpg|.png' >> temp
    grep -o -E 'href="([^"#]+)"' $4 | cut -d'"' -f2 | sort | uniq | grep -E '.jpg|.png' >> temp
}

# Download the subreddit's webpages
function download {
    rname=$( echo $1 | cut -d / -f 5 )
    tname=$(echo t.$rname)
    rrname=$(echo r.$rname)
    cname=$(echo c.$rname)
    wget --load-cookies=../cookies.txt -O $rname $1 &>/dev/null &
    wget --load-cookies=../cookies.txt -O $tname $1/top &>/dev/null &
    wget --load-cookies=../cookies.txt -O $rrname $1/rising &>/dev/null &
    wget --load-cookies=../cookies.txt -O $cname $1/controversial &>/dev/null &
    wait # wait for all forked wget processes to return
    parse $rname $tname $rrname $cname
}

# For each line in links file
while read l; do
    if [[ $l != *"#"* ]]; then # if line doesn't contain a hashtag (comment)
        download $l&
    fi
done < ../$linksFile

wait # wait for all forked processes to return

sed -i '/www.redditstatic.com/d' temp # remove reddit pics that exist on most pages from the list

wallpaper=$(shuf -n 1 temp) # select randomly from file and DL

echo $wallpaper >> $DIR/log # save image into log in case we want it later

wget -b $wallpaper -O $DIR/wallpaperpic 1>/dev/null # Download wallpaper image

gsettings set org.gnome.desktop.background picture-uri file://$DIR/wallpaperpic # Set wallpaper (Gnome only!)

rm -r $DIR/downloads # cleanup

    Just like before, you can set up a cron job to run the script for you at every reboot or whatever interval you like.
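
For example, to fetch a fresh wallpaper every morning at 8:00 instead of only at boot, a crontab entry along these lines would do it (the script path here is hypothetical; use wherever you saved yours):

0 8 * * * /bin/bash /home/user/reddit-wallpaper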

    And, there you have it—a fully functional cat-image harvester. May your morning logins be greeted with many furry faces. Now go forth and discover new subreddits to gawk at and new websites to scrape for cool wallpapers.

    Patrick Wheelan

    Cooking With Linux (without a net): A CMS Smorgasbord

    2 days 5 hours ago

    Please support Linux Journal by subscribing or becoming a patron.

Note: You are watching a recording of a live show. It's Tuesday and that means it's time for Cooking With Linux (without a net), sponsored and supported by Linux Journal. Today, I'm going to install four popular content management systems. These will be Drupal, Joomla, WordPress, and Backdrop. If you're trying to decide on what your next CMS platform should be, this would be a great time to tune in. And yes, I'll do it all live, without a net, and with a high probability of falling flat on my face. Join me today, at 12 noon, Eastern Time. Be part of the conversation.

    Content management systems covered include:

Drupal, WordPress, CMS, Web Development
    Marcel Gagné

    The Agony and the Ecstasy of Cloud Billing

    2 days 10 hours ago
Corey Quinn | Tue, 04/17/2018 - 09:40 | AWS, Cloud

    Cloud billing is inherently complex; it's not just you.

    Back in the mists of antiquity when I started reading Linux Journal, figuring out what an infrastructure was going to cost was (although still obnoxious in some ways) straightforward. You'd sign leases with colocation providers, buy hardware that you'd depreciate on a schedule and strike a deal in blood with a bandwidth provider, and you were more or less set until something significant happened to your scale.

    In today's brave new cloud world, all of that goes out the window. The public cloud providers give with one hand ("Have a full copy of any environment you want, paid by the hour!"), while taking with the other ("A single Linux instance will cost you $X per hour, $Y per GB transferred per month, and $Z for the attached storage; we simplify this pricing into what we like to call 'We Make It Up As We Go Along'").

    In my day job, I'm a consultant who focuses purely on analyzing and reducing the Amazon Web Services (AWS) bill. As a result, I've seen a lot of environments doing different things: cloud-native shops spinning things up without governance, large enterprises transitioning into the public cloud with legacy applications that don't exactly support that model without some serious tweaking, and cloud migration projects that somehow lost their way severely enough that they were declared acceptable as they were, and the "multi-cloud" label was slapped on to them. Throughout all of this, some themes definitely have emerged that I find that people don't intuitively grasp at first. To wit:

    • It's relatively straightforward to do the basic arithmetic to figure out what a current data center would cost to put into the cloud as is—generally it's a lot! If you do a 1:1 mapping of your existing data center into the cloudy equivalents, it invariably will cost more; that's a given. The real cost savings arise when you start to take advantage of cloud capabilities—your web server farm doesn't need to have 50 instances at all times. If that's your burst load, maybe you can scale that in when traffic is low to five instances or so? Only once you fall into a pattern (and your applications support it!) of paying only for what you need when you need it do the cost savings of cloud become apparent.

    • One of the most misunderstood aspects of Cloud Economics is the proper calculation of Total Cost of Ownership, or TCO. If you want to do a break-even analysis on whether it makes sense to build out a storage system instead of using S3, you've got to include a lot more than just a pile of disks. You've got to factor in disaster recovery equipment and location, software to handle replication of data, staff to run the data center/replace drives, the bandwidth to get to the storage from where it's needed, the capacity planning for future growth—and the opportunity cost of building that out instead of focusing on product features.

    • It's easy to get lost in the byzantine world of cloud billing dimensions and lose sight of the fact that you've got staffing expenses. I've yet to see a company with more than five employees wherein the cloud expense wasn't dwarfed by payroll. Unlike the toy projects some of us do as labors of love, engineering time costs a lot of money. Retraining existing staff to embrace a cloud future takes time, and not everyone takes to this new paradigm quickly.

    • Accounting is going to have to weigh in on this, and if you're not prepared for that conversation, it's likely to be unpleasant. You're going from an old world where you could plan your computing expenses a few years out and be pretty close to accurate. Cloud replaces that with a host of variables to account for, including variable costs depending upon load, amortization of Reserved Instances, provider price cuts and a complete lack of transparency with regard to where the money is actually going (Dev or Prod? Which product? Which team spun that up? An engineer left the company six months ago, but their 500TB of data is still sitting there and so on).

    The worst part is that all of this isn't apparent to newcomers to cloud billing, so when you trip over these edge cases, it's natural to feel as if the problem is somehow your fault. I do this for a living, and I was stymied trying to figure out what data transfer was likely to cost in AWS. I started drawing out how it's billed to customers, and ultimately came up with the "AWS Data Transfer Costs" diagram shown in Figure 1.

    Figure 1. A convoluted mapping of how AWS data transfer is priced out.

    If you can memorize those figures, you're better at this than I am by a landslide! It isn't straightforward, it's not simple, and it's certainly not your fault if you don't somehow intrinsically know these things.

That said, help is at hand. AWS billing is getting much more understandable, with the advent of such things as free Reserved Instance recommendations, the release of the Cost Explorer API and the rise of serverless technologies. For their part, Google's GCP and Microsoft's Azure learned from the early billing stumbles of AWS, and as a result, both have much more understandable cost structures. Additionally, there are a host of cost visibility Platform as a Service offerings out there; they all do more or less the same things as one another, but they're great for ad-hoc queries around your bill. If you'd rather build something you can control yourself, you can shove your billing information from all providers into an SQL database and run something like QuickSight or Tableau on top of it to aid visualization, as many shops do today.

    In return for this ridiculous pile of complexity, you get something rather special—the ability to spin up resources on-demand, for as little time as you need them, and pay only for the things that you use. It's incredible as a learning resource alone—imagine how much simpler it would have been in the late 1990s to receive a working Linux VM instead of having to struggle with Slackware's installation for the better part of a week. The cloud takes away, but it also gives.

    Corey Quinn

    Microsoft Announces First Custom Linux Kernel, German Government Chooses Open-Source Nextcloud and More

    2 days 10 hours ago
Microsoft, News, kernel, Cloud, open source, Events, gaming

    News briefs for April 17, 2018.

    Microsoft yesterday introduced Azure Sphere, a Linux-based OS and cloud service for securing IoT devices. According to ZDNet, "Microsoft President Brad Smith introduced Azure Sphere saying, 'After 43 years, this is the first day that we are announcing, and will distribute, a custom Linux kernel.'"

The German government's Federal Information Technology Centre (ITZBund) has chosen open-source Nextcloud for its self-hosted cloud solution, iTWire reports. Nextcloud was chosen for its strict security requirements and scalability "both in terms of large numbers of users and extensibility with additional features".

    European authorities have effectively ended the Whois public database of domain name registration, which ICANN oversees. According to The Register, the service isn't compliant with the GDPR and will be illegal as of May 25th: "ICANN now has a little over a month to come up with a replacement to the decades-old service that covers millions of domain names and lists the personal contact details of domain registrants, including their name, email and telephone number."

A new release of PySolFC, a free and open-source collection of more than 1,000 card Solitaire and Mahjong games, was announced recently. The new stable release, 2.2.0, is the first since 2009.

    Deadline for proposals to speak at Open Source Summit North America is April 29. OSSN is being held in Vancouver, BC, this year from August 29–31.

    In other event news, Red Hat today announced the keynote speakers and agenda for its largest ever Red Hat Summit being held at the Moscone Center in San Francisco, May 8–10.

    Jill Franklin

    Bassel Khartabil Free Fellowship, GNOME 3.28.1 Release, New Version of Mixxx and More

    3 days 9 hours ago
Creative Commons, GNOME, Subversion, multimedia, Monitoring

    News briefs for April 16, 2018.

    The Bassel Khartabil Free Fellowship was awarded yesterday to Majd Al-shihabi, a Palestinian-Syrian engineer and urban planning graduate based in Beirut, Lebanon: "The Fellowship will support Majd's efforts in building a unified platform for Syrian and Palestinian oral history archives, as well as the digitizing and release of previously forgotten 1940s era public domain maps of Palestine." The Creative Commons also announced the first three winners of the Bassel Khartabil Memorial Fund: Egypt-based The Mosireen Collective, and Lebanon-based Sharq.org and ASI-REM/ADEF Lebanon. For all the details, see the announcement on the Creative Commons website.

    GNOME 3.28 is ready for prime time after receiving its first point release on Friday, which includes numerous improvements and bug fixes. See the announcement for all the details on version 3.28.1.

    Apache Subversion 1.10 has been released. This version is "a superset of all previous Subversion releases, and is as of the time of its release considered the current "best" release. Any feature or bugfix in 1.0.x through 1.9.x is also in 1.10, but 1.10 contains features and bugfixes not present in any earlier release. The new features will eventually be documented in a 1.10 version of the free Subversion book." New features include improved path-based authorization, new interactive conflict resolver, added support for LZ4 compression and more. See the release notes for more information.

    A new version of Mixxx, the free and open-source DJ software, was released today. Version 2.1 has "new and improved controller mappings, updated Deere and LateNight skins, overhauled effects system, and much more".

    Kayenta, a new open-source project from Google and Netflix for automated deployment monitoring was announced recently. GeekWire reports that the project's goal is "to help other companies that want to modernize their application deployment practices but don't exactly have the same budget and expertise to build their own solution."

    Jill Franklin

    Multiprocessing in Python

    3 days 10 hours ago
Reuven M. Lerner | Mon, 04/16/2018 - 09:20 | HOW-TOs, Programming, python

    Python's "multiprocessing" module feels like threads, but actually launches processes.

    Many people, when they start to work with Python, are excited to hear that the language supports threading. And, as I've discussed in previous articles, Python does indeed support native-level threads with an easy-to-use and convenient interface.

    However, there is a downside to these threads—namely the global interpreter lock (GIL), which ensures that only one thread runs at a time. Because a thread cedes the GIL whenever it uses I/O, this means that although threads are a bad idea in CPU-bound Python programs, they're a good idea when you're dealing with I/O.

    But even when you're using lots of I/O, you might prefer to take full advantage of a multicore system. And in the world of Python, that means using processes.

    In my article "Launching External Processes in Python", I described how you can launch processes from within a Python program, but those examples all demonstrated that you can launch a program in an external process. Normally, when people talk about processes, they work much like they do with threads, but are even more independent (and with more overhead, as well).

    So, it's something of a dilemma: do you launch easy-to-use threads, even though they don't really run in parallel? Or, do you launch new processes, over which you have little control?

    The answer is somewhere in the middle. The Python standard library comes with "multiprocessing", a module that gives the feeling of working with threads, but that actually works with processes.

    So in this article, I look at the "multiprocessing" library and describe some of the basic things it can do.

    Multiprocessing Basics

    The "multiprocessing" module is designed to look and feel like the "threading" module, and it largely succeeds in doing so. For example, the following is a simple example of a multithreaded program:

#!/usr/bin/env python3

import threading
import time
import random

def hello(n):
    time.sleep(random.randint(1,3))
    print("[{0}] Hello!".format(n))

for i in range(10):
    threading.Thread(target=hello, args=(i,)).start()

print("Done!")

    In this example, there is a function (hello) that prints "Hello!" along with whatever argument is passed. It then runs a for loop that runs hello ten times, each of them in an independent thread.

    But wait. Before the function prints its output, it first sleeps for a few seconds. When you run this program, you then end up with output that demonstrates how the threads are running in parallel, and not necessarily in the order they are invoked:

$ ./thread1.py
Done!
[2] Hello!
[0] Hello!
[3] Hello!
[6] Hello!
[9] Hello!
[1] Hello!
[5] Hello!
[8] Hello!
[4] Hello!
[7] Hello!

    If you want to be sure that "Done!" is printed after all the threads have finished running, you can use join. To do that, you need to grab each instance of threading.Thread, put it in a list, and then invoke join on each thread:

#!/usr/bin/env python3

import threading
import time
import random

def hello(n):
    time.sleep(random.randint(1,3))
    print("[{0}] Hello!".format(n))

threads = [ ]
for i in range(10):
    t = threading.Thread(target=hello, args=(i,))
    threads.append(t)
    t.start()

for one_thread in threads:
    one_thread.join()

print("Done!")

    The only difference in this version is it puts the thread object in a list ("threads") and then iterates over that list, joining them one by one.

    But wait a second—I promised that I'd talk about "multiprocessing", not threading. What gives?

    Well, "multiprocessing" was designed to give the feeling of working with threads. This is so true that I basically can do some search-and-replace on the program I just presented:

    • threading → multiprocessing
    • Thread → Process
    • threads → processes
    • thread → process

    The result is as follows:

#!/usr/bin/env python3

import multiprocessing
import time
import random

def hello(n):
    time.sleep(random.randint(1,3))
    print("[{0}] Hello!".format(n))

processes = [ ]
for i in range(10):
    t = multiprocessing.Process(target=hello, args=(i,))
    processes.append(t)
    t.start()

for one_process in processes:
    one_process.join()

print("Done!")

    In other words, you can run a function in a new process, with full concurrency and take advantage of multiple cores, with multiprocessing.Process. It works very much like a thread, including the use of join on the Process objects you create. Each instance of Process represents a process running on the computer, which you can see using ps, and which you can (in theory) stop with kill.
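
For instance, while the ten processes above are sleeping, you could watch (and kill) them from another terminal with something like the following; the script name is an assumption here (whatever you saved the example as):

$ ps aux | grep proc1.py     # one line per multiprocessing.Process, plus the parent
$ pkill -f proc1.py          # terminate them all (the "in theory" part)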

    What's the Difference?

    What's amazing to me is that the API is almost identical, and yet two very different things are happening behind the scenes. Let me try to make the distinction clearer with another pair of examples.

    Perhaps the biggest difference, at least to anyone programming with threads and processes, is the fact that threads share global variables. By contrast, separate processes are completely separate; one process cannot affect another's variables. (In a future article, I plan to look at how to get around that.)

    Here's a simple example of how a function running in a thread can modify a global variable (note that what I'm doing here is to prove a point; if you really want to modify global variables from within a thread, you should use a lock):

#!/usr/bin/env python3

import threading
import time
import random

mylist = [ ]

def hello(n):
    time.sleep(random.randint(1,3))
    mylist.append(threading.get_ident())   # bad in real code!
    print("[{0}] Hello!".format(n))

threads = [ ]
for i in range(10):
    t = threading.Thread(target=hello, args=(i,))
    threads.append(t)
    t.start()

for one_thread in threads:
    one_thread.join()

print("Done!")
print(len(mylist))
print(mylist)

    The program is basically unchanged, except that it defines a new, empty list (mylist) at the top. The function appends its ID to that list and then returns.

    Now, the way that I'm doing this isn't so wise, because Python data structures aren't thread-safe, and appending to a list from within multiple threads eventually will catch up with you. But the point here isn't to demonstrate threads, but rather to contrast them with processes.

    When I run the above code, I get:

$ ./th-update-list.py
[0] Hello!
[2] Hello!
[6] Hello!
[3] Hello!
[1] Hello!
[4] Hello!
[5] Hello!
[7] Hello!
[8] Hello!
[9] Hello!
Done!
10
[123145344081920, 123145354592256, 123145375612928, 123145359847424, 123145349337088, 123145365102592, 123145370357760, 123145380868096, 123145386123264, 123145391378432]

    So, you can see that the global variable mylist is shared by the threads, and that when one thread modifies the list, that change is visible to all the other threads.

    But if you change the program to use "multiprocessing", the output looks a bit different:

#!/usr/bin/env python3

import multiprocessing
import time
import random
import os

mylist = [ ]

def hello(n):
    time.sleep(random.randint(1,3))
    mylist.append(os.getpid())
    print("[{0}] Hello!".format(n))

processes = [ ]
for i in range(10):
    t = multiprocessing.Process(target=hello, args=(i,))
    processes.append(t)
    t.start()

for one_process in processes:
    one_process.join()

print("Done!")
print(len(mylist))
print(mylist)

    Aside from the switch to multiprocessing, the biggest change in this version of the program is the use of os.getpid to get the current process ID.

    The output from this program is as follows:

$ ./proc-update-list.py
[0] Hello!
[4] Hello!
[7] Hello!
[8] Hello!
[2] Hello!
[5] Hello!
[6] Hello!
[9] Hello!
[1] Hello!
[3] Hello!
Done!
0
[]

    Everything seems great until the end when it checks the value of mylist. What happened to it? Didn't the program append to it?

    Sort of. The thing is, there is no "it" in this program. Each time it creates a new process with "multiprocessing", each process has its own value of the global mylist list. Each process thus adds to its own list, which goes away when the processes are joined.

    This means the call to mylist.append succeeds, but it succeeds in ten different processes. When the function returns from executing in its own process, there is no trace left of the list from that process. The only mylist variable in the main process remains empty, because no one ever appended to it.

    Queues to the Rescue

In the world of threaded programs, even when you're able to append to the global mylist variable, you shouldn't do it. That's because Python's data structures aren't thread-safe. Indeed, only one data structure is guaranteed to be safe for this kind of sharing—the Queue class, available as queue.Queue for threads and as its counterpart in the multiprocessing module for processes.

    Queues are FIFOs (that is, "first in, first out"). Whoever wants to add data to a queue invokes the put method on the queue. And whoever wants to retrieve data from a queue uses the get command.

    Now, queues in the world of multithreaded programs prevent issues having to do with thread safety. But in the world of multiprocessing, queues allow you to bridge the gap among your processes, sending data back to the main process. For example:

#!/usr/bin/env python3

import multiprocessing
import time
import random
import os
from multiprocessing import Queue

q = Queue()

def hello(n):
    time.sleep(random.randint(1,3))
    q.put(os.getpid())
    print("[{0}] Hello!".format(n))

processes = [ ]
for i in range(10):
    t = multiprocessing.Process(target=hello, args=(i,))
    processes.append(t)
    t.start()

for one_process in processes:
    one_process.join()

mylist = [ ]
while not q.empty():
    mylist.append(q.get())

print("Done!")
print(len(mylist))
print(mylist)

    In this version of the program, I don't create mylist until late in the game. However, I create an instance of multiprocessing.Queue very early on. That Queue instance is designed to be shared across the different processes. Moreover, it can handle any type of Python data that can be stored using "pickle", which basically means any data structure.

    In the hello function, it replaces the call to mylist.append with one to q.put, placing the current process' ID number on the queue. Each of the ten processes it creates will add its own PID to the queue.

    Note that this program takes place in stages. First it launches ten processes, then they all do their work in parallel, and then it waits for them to complete (with join), so that it can process the results. It pulls data off the queue, puts it onto mylist, and then performs some calculations on the data it has retrieved.

    The implementation of queues is so smooth and easy to work with, it's easy to forget that these queues are using some serious behind-the-scenes operating system magic to keep things coordinated. It's easy to think that you're working with threading, but that's just the point of multiprocessing; it might feel like threads, but each process runs separately. This gives you true concurrency within your program, something threads cannot do.

    Conclusion

    Threading is easy to work with, but threads don't truly execute in parallel. Multiprocessing is a module that provides an API that's almost identical to that of threads. This doesn't paper over all of the differences, but it goes a long way toward making sure things aren't out of control.

    Reuven M. Lerner

    FOSS Project Spotlight: Ravada

    6 days 6 hours ago
Francesc Guasch | Fri, 04/13/2018 - 13:48 | Desktop, FOSS, KVM, Virtualization

    Ravada is an open-source project that allows users to connect to a virtual desktop.

Currently, it supports KVM, but its back end has been designed and implemented in order to allow future hypervisors to be added to the framework. The client's only requirements are a web browser and a remote viewer that supports the SPICE protocol.

    Ravada's main features include:

    • KVM back end.
    • LDAP and SQL authentication.
    • Kiosk mode.
    • Remote access for Windows and Linux.
    • Light and fast virtual machine clones for each user.
    • Instant clone creation.
    • USB redirection.
    • Easy and customizable end-user interface (i18n, l10n).
    • Administration from a web browser.

    It's very easy to install and use. Following the documentation, virtual machines can be deployed in minutes. It's an early release, but it's already used in production. The project is open source, and you can download the code from GitHub. Contributions welcome!

    Francesc Guasch

    Elisa Music Player Debuts, Zenroom Crypto-Language VM Reaches Version 0.5.0 and More

    6 days 10 hours ago
KDE, multimedia, Security, ZFS, System76, GNOME, News

    News briefs for April 13, 2018.

The Elisa music player, developed by the KDE community, debuted yesterday, with version 0.1. Elisa has good integration with the Plasma desktop and also supports other Linux desktop environments, as well as Windows and Android. In addition, the Elisa release announcement notes, "We are creating a reliable product that is a joy to use and respects our users' privacy. As such, we will prefer to support online services where users are in control of their data."

    Mozilla released Firefox 11.0 for iOS yesterday, and this new version turns on tracking protection by default. The feature uses a list provided by Disconnect to identify trackers, and it also provides options for turning it on or off overall or for specific websites.

    The Zenroom project, a brand-new crypto-language virtual machine, has reached version 0.5.0. Zenroom's goal is "improving people's awareness of how their data is processed by algorithms, as well facilitate the work of developers to create and publish algorithms that can be used both client and server side." In addition, it "has no external dependencies, is smaller than 1MB, runs in less than 64KiB memory and is ready for experimental use on many target platforms: desktop, embedded, mobile, cloud and browsers." The program is free software and is licensed under the GNU LGPL v3. Its main use case is "distributed computing of untrusted code where advanced cryptographic functions are required".

    ZFS On Linux, recently in the news for data-loss issues, may finally be getting SSD TRIM support, which has been in the works for years, according to Phoronix.

    System76 recently became a GNOME Foundation Advisory Board member. Neil McGovern, Executive Director of the GNOME Foundation, commented "System76's long-term ambition to see free software grow is highly commendable, and we're extremely pleased that they're coming on board to help support the Foundation and the community." See the betanews article for more details.

    Jill Franklin

    Facebook Compartmentalization

    1 week ago
Kyle Rankin | Thu, 04/12/2018 - 10:06 | Facebook, Tor, Privacy, Security

    I don't always use Facebook, but when I do, it's over a compartmentalized browser over Tor.

Whenever people talk about protecting privacy on the internet, social-media sites like Facebook inevitably come up—especially right now. It makes sense—social networks (like Facebook) provide a platform where you can share your personal data with your friends, and it doesn't come as much of a surprise to people to find out they also share that data with advertisers (it's how they pay the bills after all). It makes sense that Facebook uses data you provide when you visit that site. What some people might be surprised to know, however, is just how much Facebook tracks them when they aren't using Facebook itself but just browsing around the web.

    Some readers may solve the problem of Facebook tracking by saying "just don't use Facebook"; however, for many people, that site may be the only way they can keep in touch with some of their friends and family members. Although I don't post on Facebook much myself, I do have an account and use it to keep in touch with certain friends. So in this article, I explain how I employ compartmentalization principles to use Facebook without leaking too much other information about myself.

    1. Post Only Public Information

    The first rule for Facebook is that, regardless of what you think your privacy settings are, you are much better off if you treat any content you provide there as being fully public. For one, all of those different privacy and permission settings can become complicated, so it's easy to make a mistake that ends up making some of your data more public than you'd like. Second, even with privacy settings in place, you don't have a strong guarantee that the data won't be shared with people willing to pay for it. If you treat it like a public posting ground and share only data you want the world to know, you won't get any surprises.

    2. Give Facebook Its Own Browser

    I mentioned before that Facebook also can track what you do when you browse other sites. Have you ever noticed little Facebook "Like" icons on other sites? Often websites will include those icons to help increase engagement on their sites. What it also does, however, is link the fact that you visited that site with your specific Facebook account—even if you didn't click "Like" or otherwise engage with the site. If you want to reduce how much you are tracked, I recommend selecting a separate browser that you use only for Facebook. So if you are a Firefox user, load Facebook in Chrome. If you are a Chrome user, view Facebook in Firefox. If you don't want to go to the trouble of managing two different browsers, at the very least, set up a separate Firefox profile (run firefox -P from a terminal) that you use only for Facebook.
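
For example, a dedicated profile can be created and launched from the command line like this (the profile name "facebook" is just a suggestion):

firefox -CreateProfile facebook    # create a new, empty profile named "facebook"
firefox -P facebook -no-remote     # launch a separate Firefox instance that uses it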

    3. View Facebook over Tor

Many people don't know that Facebook itself offers a .onion service that allows you to view Facebook over Tor. It may seem counterintuitive that a site that wants so much of your data would also want to use an anonymizing service, but it makes sense if you think it through. Sure, if you access Facebook over Tor, Facebook will know it's you that's accessing it, but it won't know from where. More important, no other sites on the internet will know you are accessing Facebook from that account, even if they try to track via IP.

    To use Facebook's private .onion service, install the Tor Browser Bundle, or otherwise install Tor locally, and follow the Tor documentation to route your Facebook-only browser to its SOCKS proxy service. Then visit https://facebookcorewwwi.onion, and only you and Facebook will know you are hitting the site. By the way, one advantage to setting up a separate browser that uses a SOCKS proxy instead of the Tor Browser Bundle is that the Tor Browser Bundle attempts to be stateless, so you will have a tougher time making the Facebook .onion address your home page.
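
    As a rough sketch of what that setup can look like on a Debian-based system, you might install Tor and then point the Facebook-only Firefox profile at Tor's local SOCKS port (9050 by default). The package and service names below assume Debian or Ubuntu, and the user_pref lines are standard Firefox preferences you could drop into that profile's user.js file:

    # install and start the Tor service (Debian/Ubuntu example)
    sudo apt install tor
    sudo systemctl enable --now tor

    // in the Facebook-only profile's user.js: route traffic through Tor's
    // SOCKS proxy and resolve hostnames (including .onion) through Tor
    user_pref("network.proxy.type", 1);
    user_pref("network.proxy.socks", "127.0.0.1");
    user_pref("network.proxy.socks_port", 9050);
    user_pref("network.proxy.socks_remote_dns", true);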

    Conclusion

    So sure, you could decide to opt out of Facebook altogether, but if you don't have that luxury, I hope a few of these compartmentalization steps will help you use Facebook in a way that doesn't completely remove your privacy.

    Kyle Rankin

    Mozilla's Internet Health Report, Google's Fuchsia, Purism Development Docs and More

    1 week ago
    Mozilla Privacy kernel Google Purism

    News briefs for April 12, 2018.

    Mozilla recently published its annual Internet Health Report. Its three major concerns are:

    • "Consolidation of power over the Internet, particularly by Facebook, Google, Tencent, and Amazon."
    • "The spread of 'fake news,' which the report attributes in part to the 'broken online advertising economy' that provides financial incentive for fraud, misinformation, and abuse."
    • The threat to privacy posed by the poor security of the Internet of Things.

    (Source: Ars Technica's "The Internet has serious health problems, Mozilla Foundation report finds")

    Idle power on some Linux systems could drop by 10% or more with the Linux 4.17 kernel, reports Phoronix. Evidently, that's not all that's in the works regarding power management features: "performance of workloads where the idle loop overhead was previously significant could now see greater gains too". See Rafael Wysocki's "More power management updates for v4.17-rc-1" pull request.

    Google's "not-so-secret" operating system named Fuchsia that's been in development for almost two years has attracted much speculation, but now we finally know what it is not. It's not Linux. According to a post on xda, Google published a documentation page called "the book" that explains what Fuchsia is and isn't. Several details still need to be filled in, but documentation will be added as things develop.

    Instagram will soon allow users to download their data, including photos, videos and messages, according to a TechCrunch report: "This tool could make it much easier for users to leave Instagram and go to a competing image social network. And as long as it launches before May 25th, it will help Instagram to comply with upcoming European GDPR privacy law that requires data portability."

    Purism has started its developer docs effort in anticipation of development boards being shipped this summer. According to the post on the Purism website, "There will be technical step-by-step instructions that are suitable for both newbies and experienced Debian developers alike. The goal of the docs is to openly welcome you and light your path along the way with examples and links to external documentation." You can see the docs here.

    Jill Franklin

    Promote Drupal Initiative Announced at DrupalCon

    1 week 1 day ago
    Promote Drupal Initiative Announced at DrupalCon Image Katherine Druckman Wed, 04/11/2018 - 11:03 Drupal

    Yesterday's keynote from Drupal project founder Dries Buytaert kicked off the annual North American gathering of Drupalists from around the world, and it also launched a new Drupal community initiative aimed at promoting the Drupal platform through a coordinated marketing effort using funds raised within the community.

    The Drupal Association hopes to raise $100,000 to enable a global group of staff and volunteers to complete the first two phases of a four-phase plan. The plan's goal is consistent, reusable marketing materials that agencies and other Drupal promoters can use to communicate Drupal's benefits to organizations and potential customers quickly and effectively.

    Convincing non-geeks and non-technical decision-makers of Drupal's strengths has always been a pain point, and we'll be watching with great interest as this initiative progresses.

    Also among the announcements were demonstrations of how easy it could soon be to manipulate content within the Drupal back end using a drag-and-drop interface, which would provide great flexibility for site builders and content editors.

    We also expect to see improvements to the Drupal site-builder experience in upcoming releases, as well as improvements to the built-in configuration management process, which eases the deployment process when developing in Drupal.

    See the full keynote to get inspired by what's to come in the Drupalverse.

    And also see the DrupalCon Nashville Playlist!

    Katherine Druckman

    OSI's Simon Phipps on Open Source's Past and Future

    1 week 1 day ago
    OSI's Simon Phipps on Open Source's Past and Future Image Christine Hall Wed, 04/11/2018 - 09:20 open source

    With an eye on the future, the Open Source Initiative's president sits down and talks with Linux Journal about the organization's 20-year history.

    It would be difficult for anyone who follows Linux and open source to have missed the 20th birthday of open source in early February. This was a dual celebration, actually, noting the passing of 20 years since the term "open source" was first coined and since the formation of the Open Source Initiative (OSI), the organization that decides whether software licenses qualify to wear that label.

    The party came six months or so after Facebook was successfully convinced by the likes of the Apache Foundation; WordPress's developer, Automattic; the Free Software Foundation (FSF); and OSI to change the licensing of its popular React project away from the BSD + Patents license, a license that had flown under the radar for a while.

    The brouhaha began when Apache developers noticed a term in the license forbidding users from suing Facebook over any patent issues. That term was troublesome because it gave special consideration to a single entity, Facebook, which pretty much disqualified it from being an open-source license.

    Although the incident worked out well—after some grumblings Facebook relented and changed the license to MIT—the Open Source Initiative fell under some criticism for having approved the BSD + Patents license, with some people suggesting that maybe it was time for OSI to be rolled over into an organization such as the Linux Foundation.

    The problem was that OSI had never approved the BSD + Patents.

    Simon Phipps delivers the keynote at Kopano Conference 2017 in Arnhem, the Netherlands.

    "BSD was approved as a license, and Facebook decided that they would add the software producer equivalent of a signing statement to it", OSI's president, Simon Phipps, recently explained to Linux Journal. He continued:

    They decided they would unilaterally add a patent grant with a defensive clause in it. They found they were able to do that for a while simply because the community accepted it. Over time it became apparent to people that it was actually not an acceptable patent grant, that it unduly favored Facebook and that if it was allowed to grow to scale, it would definitely create an environment where Facebook was unfairly advantaged.

    He added that the Facebook incident was actually beneficial for OSI and ended up being a validation of the open-source approval process:

    I think the consequence of that encounter is that more people are now convinced that the whole licensing arrangement that open-source software is under needs to be approved at OSI.

    I think prior to that, people felt it was okay for there just to be a license and then for there to be arbitrary additional terms applied. I think that the consensus of the community has moved on from that. I think it would be brave for a future software producer to decide that they can add arbitrary terms unless those arbitrary terms are minimally changing the rights and benefits of the community.

    As for the notion that OSI should be folded into a larger organization such as the Linux Foundation?

    "When I first joined OSI, which was back in 2009 I think, I shared that view", Phipps said. He continued:

    I felt that OSI had done its job and could be put into an existing organization. I came to believe that wasn't the case, because the core role that OSI plays is actually a specialist role. It's one that needs to be defined and protected. Each of the organizations I could think of where OSI could be hosted would almost certainly not be able to give the role the time and attention it was due. There was a risk there would be a capture of that role by an actor who could not be trusted to conduct it responsibly.

    That risk of the license approval role being captured is what persuaded me that I needed to join the OSI board and that I needed to help it to revamp and become a member organization, so that it could protect the license approval role in perpetuity. That's why over the last five to six years, OSI has dramatically changed.

    This is Phipps' second go at being president at OSI. He originally served in the position from 2012 until 2015, when he stepped down in preparation for the end of his term on the organization's board. He returned to the position last year after his replacement, Allison Randal, suddenly stepped down to focus on her pursuit of a PhD.

    His return was pretty much universally seen in a positive light. During his first three-year stint, the organization moved toward a membership-based governance structure and started an affiliate membership program for nonprofit charitable organizations, industry associations and academic institutions. This eventually led to an individual membership program and the inclusion of corporate sponsors.

    Although OSI is one of the best known open-source organizations, its grassroots approach has helped keep it on the lean side, especially when compared to organizations like the behemoth Linux or Mozilla Foundations. Phipps, for example, collects no salary for performing his presidential duties. Compare that with the Linux Foundation's executive director, Jim Zemlin, whose salary in 2010 was reportedly north of $300,000.

    "We're a very small organization actually", Phipps said. He added:

    We have a board of directors of 11 people and we have one paid employee. That means the amount of work we're likely to do behind the scenes has historically been quite small, but as time is going forward, we're gradually expanding our reach. We're doing that through working groups and we're doing that through bringing together affiliates for particular projects.

    While the public perception might be that OSI's role is merely the approval of open-source licenses, Phipps sees a larger picture. According to him, the point of all the work OSI does, including the approval process, is to pave the way to make the road smoother for open-source developers:

    The role that OSI plays is to crystallize consensus. Rather than being an adjudicator that makes decisions ex cathedra, we're an organization that provides a venue for people to discuss licensing. We then identify consensus as it arises and then memorialize that consensus. We're more speaker-of-the-house than king.

    That provides an extremely sound way for people to reduce the burden on developers of having to evaluate licensing. As open source becomes more and more the core of the way businesses develop software, it's more and more valuable to have that crystallization of consensus process taking out the uncertainty for people who are needing to work between different entities. Without that, you need to constantly be seeking legal advice, you need to constantly be having discussions about whether a license meets the criteria for being open source or not, and the higher uncertainty results in fewer contributions and less collaboration.

    One of OSI's duties, and one it has in common with organizations such as the Free Software Foundation (FSF), is that of enforcer of compliance issues with open-source licenses. Like the FSF, OSI prefers to take a carrot rather than stick approach. And because it's the organization that approves open-source licenses, it's in a unique position to nip issues in the bud. Those issues can run the gamut from unnecessary licenses to freeware masquerading as open source. According to Phipps:

    We don't do that in private. We do that fairly publicly and we don't normally need to do that. Normally a member of the license review mailing list, who are all simply members of the community, will go back to people and say "we don't think that's distinctive", "we don't think that's unique enough", "why didn't you use license so and so", or they'll say, "we really don't think your intent behind this license is actually open source." Typically OSI doesn't have to go and say those things to people.

    The places where we do get involved in speaking to people directly is where they describe things as open source when they haven't bothered to go through that process and that's the point at which we'll communicate with people privately.

    The problem of freeware—proprietary software that's offered without cost—being marketed under the open-source banner is particularly troublesome. In those cases, OSI definitely will reach out and contact the offending companies, as Phipps says, "We do that quite often, and we have a good track record of helping people understand why it's to their business disadvantage to behave in that way."

    One of the reasons why OSI is able to get commercial software developers to heed its advice might be because the organization has never taken an anti-business stance. Founding member Michael Tiemann, now VP of open-source affairs at Red Hat, once said that one of the reasons the initiative chose the term "open source" was to "dump the moralizing and confrontational attitude that had been associated with 'free software' in the past and sell the idea strictly on the same pragmatic, business-case grounds that had motivated Netscape."

    These days, the organization has ties with many major software vendors and receives most of its financial support from corporate sponsors. However, it has taken steps to ensure that corporate sponsors don't dictate OSI policy. According to Phipps:

    If you want to join a trade association, that's what the Linux Foundation is there for. You can go pay your membership fees and buy a vote there, but OSI is a 501(c)(3). That means it's a charity that's serving the public's interest and the public benefit.

    It would be wrong for us to allow OSI to be captured by corporate interests. When we conceived the sponsorship scheme, we made sure that there was no risk that would happen. Our corporate sponsors do not get any governance role in the organization. They don't get a vote over what's happening, and we've been very slow to accept new corporate sponsors because we wanted to make sure that no one sponsor could have an undue influence if they decided that they no longer liked us or decided to stop paying the sponsorship fees.

    This pragmatic approach, which also puts "permissive" licenses like Apache and MIT on equal footing with "copyleft" licenses like the GPL, has traditionally not been met with universal approval from FOSS advocates. The FSF's Richard Stallman has been critical of the organization, although noting that his organization and OSI are essentially on the same page. Years ago, OSI co-founder and creator of The Open Source Definition, Bruce Perens, decried the "schism" between the Free Software and Open Source communities—a schism that Phipps seeks to narrow:

    As I've been giving keynotes about the first 20 years and the next ten years of open source, I've wanted to make very clear to people that open source is a progression of the pre-existing idea of free software, that there is no conflict between the idea of free software and the way it can be adopted for commercial or for more structured use under the term open source.

    One of the things that I'm very happy about over the last five to six years is the good relations we've been able to have with the Free Software Foundation Europe. We've been able to collaborate with them over amicus briefs in important lawsuits. We are collaborating with them over significant issues, including privacy and including software patents, and I hope in the future that we'll be able to continue cooperating and collaborating. I think that's an important thing to point out, that I want the pre-existing world of free software to have its due credit.

    Software patents represent one of several areas into which OSI has been expanding. Patents have long been a thorny issue for open source, because they have the potential to affect not only people who develop software, but also companies who merely run open-source software on their machines. They also can be like a snake in the grass; any software application can be infringing on an unknown patent. According to Phipps:

    We have a new project that is just getting started, revisiting the role of patents and standards. We have helped bring together a post-graduate curriculum on open source for educating graduates on how to develop open-source software and how to understand it.

    We also host other organizations that need a fiduciary host so that they don't have to do their own bookkeeping and legal filings. For a couple years, we hosted the Open Hatch Project, which has now wound up, and we host other activities. For example, we host the mailing lists for the California Association of Voting Officials, who are trying to promote open-source software in voting machines in North America.

    Like everyone else in tech these days, OSI is also grappling with diversity issues. Phipps said the organization is seeking to deal with that issue by starting at the membership level:

    At the moment I feel that I would very much like to see a more diverse membership. I'd like to see us more diverse geographically. I'd like to see us more diverse in terms of the ethnicity and gender of the people who are involved. I would like to see us more diverse in terms of the businesses from which people are employed.

    I'd like to see all those improve and so, over the next few years (assuming that I remain president because I have to be re-elected every year by the board) that will also be one of the focuses that I have.

    And to wrap things up, here's how he plans to go about that:

    This year is the anniversary year, and we've been able to arrange for OSI to be present at a conference pretty much every month, in some cases two or three per month, and the vast majority of those events are global. For example, FOSSASIA is coming up, and we're backing that. We are sponsoring a hostel where we'll be having 50 software developers who are able to attend FOSSASIA because of the sponsorship. Our goal here is to raise our profile and to recruit membership by going and engaging with local communities globally. I think that's going to be a very important way that we do it.

    Christine Hall

    Red Hat Enterprise Linux 7.5 Released, Valve Improves Steam Privacy Settings, New Distribution Specification Project for Containers and More

    1 week 1 day ago
    News Red Hat Distributions Containers Security gaming Privacy

    News briefs for April 11, 2018.

    Red Hat Enterprise Linux 7.5 was released yesterday. New features include "enhanced security and compliance, usability at scale, continued integration with Windows infrastructure on-premise and in Microsoft Azure, and new functionality for storage cost controls. The release also includes continued investment in platform manageability for Linux beginners, experts, and Microsoft Windows administrators." See the release notes for more information.

    The Open Container Initiative (OCI) yesterday announced the launch of the Distribution Specification Project: "having a solid, common distribution specification with conformance testing will ensure long lasting security and interoperability throughout the container ecosystem". See also "Open Container Initiative nails down container image distribution standard" on ZDNet for more details.

    Valve is offering new and improved privacy settings for Steam users, providing more detailed descriptions of the settings so you can better manage what your friends and the wider Steam community see. The announcement notes, "Additionally, regardless of which setting you choose for your profile's game details, you now have the option to keep your total game playtime private. You no longer need to nervously laugh it off as a bug when your friends notice the 4,000+ hours you've put into Ricochet."

    Thousands of websites have been hacked to serve "fake update notifications to install banking malware and remote access trojans on visitors' computers", according to security researchers at Malwarebytes. Ars Technica reports that "The attackers also fly under the radar by using highly obfuscated JavaScript. Among the malicious software installed in the campaign was the Chthonic banking malware and a commercial remote access trojan known as NetSupport."

    Krita 4.0.1 was released yesterday. This new version fixes more than 50 bugs since the 4.0 release and includes many improvements to the UI.

    Jill Franklin

    Simple Cloud Hardening

    1 week 2 days ago
    Simple Cloud Hardening Image Kyle Rankin Tue, 04/10/2018 - 10:30 AWS Cloud Security

    Apply a few basic hardening principles to secure your cloud environment.

    I've written about simple server-hardening techniques in the past. Those articles were inspired in part by the Linux Hardening in Hostile Networks book I was writing at the time, and the idea was to distill the many different hardening steps you might want to perform on a server into a few simple steps that everyone should do. In this article, I take the same approach only with a specific focus on hardening cloud infrastructure. I'm most familiar with AWS, so my hardening steps are geared toward that platform and use AWS terminology (such as Security Groups and VPC), but as I'm not a fan of vendor lock-in, I try to include steps that are general enough that you should be able to adapt them to other providers.

    New Accounts Are (Relatively) Free; Use Them

    One of the big advantages of cloud infrastructure is the ability to compartmentalize your infrastructure. If you have a bunch of servers racked in the same rack, that kind of compartmentalization might be difficult, but on cloud infrastructure, you can take advantage of the same technology the provider uses to isolate one customer from another to isolate your own infrastructure types from each other. Although this doesn't come completely for free (it adds some extra overhead when you set things up), it's worth it for the strong isolation it provides between environments.

    One of the first security measures you should put in place is separating each of your environments into its own high-level account. AWS allows you to generate a number of different accounts and connect them to a central billing account. This means you can isolate your development, staging and production environments (plus any others you may create) into their own individual accounts, each with its own networks, its own credentials and its own roles, totally isolated from the others. With each environment separated into its own account, you limit the damage attackers can do if they compromise one infrastructure to just that account. You also make it easier to see how much each environment costs by itself.

    In a traditional infrastructure where dev and production are together, it is much easier to create accidental dependencies between those two environments and have a mistake in one affect the other. Splitting environments into separate accounts protects them from each other, and that independence helps you identify any legitimate links that environments need to have with each other. Once you have identified those links, it's much easier to set up firewall rules or other restrictions between those accounts, just like you would if you wanted your infrastructure to talk to a third party.
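
    If you use AWS Organizations, spinning up a new member account that rolls up to your central billing account can be done from the command line. The sketch below assumes an organization already exists; the email address and account name are placeholders:

    # create an isolated member account (email and name are placeholders)
    aws organizations create-account \
        --email staging-admin@example.com \
        --account-name "staging"

    # check on the status of the account-creation request
    aws organizations list-create-account-status --states IN_PROGRESS SUCCEEDED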

    Lock Down Security Groups

    One advantage of cloud infrastructure is that you have much tighter control over firewall rules. AWS Security Groups let you define both ingress and egress firewall rules, both with the internet at large and between Security Groups. Since you can assign multiple Security Groups to a host, you have a lot of flexibility in how you define network access between hosts.

    My first recommendation is to deny all ingress and egress traffic by default and add specific rules to a Security Group as you need them. This is a fundamental best practice for network security, and it applies to Security Groups as much as to traditional firewalls. It is particularly important for the Default Security Group, which allows unrestricted internet egress traffic out of the box; that should be one of the first things you disable. Although disabling egress traffic to the internet by default can make things a bit trickier to start with, it's still a lot easier than trying to add that kind of restriction after the fact.
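
    As a rough illustration of removing that default egress rule and then opening only what you need, here is what the AWS CLI calls can look like; the Security Group ID is a placeholder:

    # remove the default allow-all egress rule from a Security Group
    aws ec2 revoke-security-group-egress \
        --group-id sg-0123456789abcdef0 \
        --ip-permissions '[{"IpProtocol":"-1","IpRanges":[{"CidrIp":"0.0.0.0/0"}]}]'

    # then add back only the traffic you actually need, for example HTTPS out
    aws ec2 authorize-security-group-egress \
        --group-id sg-0123456789abcdef0 \
        --ip-permissions '[{"IpProtocol":"tcp","FromPort":443,"ToPort":443,"IpRanges":[{"CidrIp":"0.0.0.0/0"}]}]'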

    You can make things very complicated with Security Groups; however, my recommendation is to try to keep them simple. Give each server role (for instance web, application, database and so on) its own Security Group that applies to each server in that role. This makes it easy to know how your firewall rules are being applied and to which servers they apply. If one server in a particular role needs different network permissions from the others, it's a good sign that it probably should have its own role.

    The role-based Security Group model works pretty well but can be inconvenient when you want a firewall rule to apply to all your hosts. For instance, if you use centralized configuration management, you probably want every host to be allowed to talk to it. For rules like this, I take advantage of the Default Security Group and make sure that every host is a member of it. I then use it (in a very limited way) as a central place to define any firewall rules I want to apply to all hosts. One rule I define in particular is to allow egress traffic to any host in the Default Security Group—that way I don't have to write duplicate ingress rules in one group and egress rules in another whenever I want hosts in one Security Group to talk to another.
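
    To make the role-based model concrete, the sketch below creates a web role group and then attaches both that group and the Default Security Group to an instance, so any shared rules defined in Default apply to it as well. All of the IDs are placeholders:

    # create a role-based Security Group for web servers
    aws ec2 create-security-group \
        --group-name web \
        --description "web server role" \
        --vpc-id vpc-0123456789abcdef0

    # attach both the role group and the Default group to a host
    aws ec2 modify-instance-attribute \
        --instance-id i-0123456789abcdef0 \
        --groups sg-0aaaaaaaaaaaaaaaa sg-0bbbbbbbbbbbbbbbb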

    Use Private Subnets

    On cloud infrastructure, you are able to define hosts that have an internet-routable IP and hosts that only have internal IPs. In AWS Virtual Private Cloud (VPC), you define these hosts by setting up a second set of private subnets and spawning hosts within those subnets instead of the default public subnets.
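
    As a minimal sketch, creating a private subnet comes down to carving out a new CIDR block in your VPC and associating it with a route table that has no route to an internet gateway; the IDs, CIDR block and availability zone below are placeholders:

    # create a subnet intended to be private
    aws ec2 create-subnet \
        --vpc-id vpc-0123456789abcdef0 \
        --cidr-block 10.0.2.0/24 \
        --availability-zone us-east-1a

    # associate it with a route table that has no 0.0.0.0/0 route to an
    # internet gateway, so its hosts cannot reach the internet directly
    aws ec2 associate-route-table \
        --route-table-id rtb-0123456789abcdef0 \
        --subnet-id subnet-0123456789abcdef0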

    Treat the default public subnet like a DMZ and put hosts there only if they truly need access to the internet. Put all other hosts into the private subnet. With this practice in place, even if hosts in the private subnet were compromised, they couldn't talk directly to the internet even if an attacker wanted them to, which makes it much more difficult to download rootkits or other persistence tools without setting up elaborate tunnels.

    These days it seems like just about every service wants unrestricted access to web ports on some other host on the internet. An advantage of the private-subnet approach is that instead of working out egress firewall rules to specific external IPs, you can set up a web proxy service in your DMZ that has broader internet access and then restrict the hosts in the private subnet by hostname instead of IP. This has the added benefit of giving you a nice audit trail on the proxy host of all the external hosts your infrastructure is accessing.

    Use Account Access Control Lists Minimally

    AWS provides a rich set of access control list tools by way of IAM. This lets you set up very precise rules about which AWS resources an account or role can access using a very complicated syntax. While IAM provides you with some pre-defined rules to get you started, it still suffers from the problem all rich access control lists have—the complexity makes it easy to create mistakes that grant people more access than they should have.

    My recommendation is to use IAM only as much as is necessary to lock down basic AWS account access (for sysadmin accounts or orchestration tools, for instance), and even then, to keep the IAM rules as simple as you can. If you need to restrict access to resources further, use access control at another level to achieve it. Although it may seem like giving somewhat broad IAM permissions to an AWS account isn't as secure as drilling down and embracing the principle of least privilege, in practice, the more complicated your rules, the more likely you are to make a mistake.
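
    One simple pattern consistent with this advice is to lean on AWS-managed policies instead of authoring elaborate custom ones. The role name below is a placeholder; AmazonEC2ReadOnlyAccess is an existing AWS-managed policy:

    # attach a broad-but-bounded managed policy to a role
    aws iam attach-role-policy \
        --role-name monitoring \
        --policy-arn arn:aws:iam::aws:policy/AmazonEC2ReadOnlyAccess

    # review what the role can do before granting it anything more
    aws iam list-attached-role-policies --role-name monitoring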

    Conclusion

    Cloud environments provide a lot of complex options for security; however, it's more important to set a good baseline of simple security practices that everyone on the team can understand. This article provides a few basic, common-sense practices that should make your cloud environments safer while not making them too complex.

    Kyle Rankin