Linux Journal

Eclipse Foundation's New Open-Source Governance Model for Jakarta EE, Turris MOX Modular Router Campaign and More

2 months 4 weeks ago
News open source Cloud Android Google SUSE GNOME

News briefs for April 24, 2018.

The Eclipse Foundation announced today a new open-source governance model and "a 'cloud native Java' path forward for Jakarta EE, the new community-led platform created from the contribution of Java EE." According to the press release, with this move to the community-driven open-source governance model, "Jakarta EE promises faster release and innovation cycles." See https://jakarta.ee for more details or to join the Jakarta EE Working Group.

A Czech Republic company has launched an Indiegogo campaign to create a modular and open-source router called Turris MOX. The Turris MOX team claims it's "probably the first router configurable as easily as a sandwich. Choose the parts you actually want." (Source: It's FOSS.)

Amnesty International's Technology and Human Rights researcher Joe Westby recently said that Google "shows total contempt for Android users' privacy" in reference to the launch of its new "Chat" messaging service: "With its baffling decision to launch a messaging service without end-to-end encryption, Google has shown utter contempt for the privacy of Android users and handed a precious gift to cybercriminals and government spies alike, allowing them easy access to the content of Android users' communications." (Source: BleepingComputer.)

SUSE recently announced that SUSE Linux Enterprise High Performance Computing is in the SLE 15 Beta program: "as part of the (upcoming) major version of SUSE Linux Enterprise 15, SUSE Linux Enterprise High Performance Computing (HPC) is now a dedicated SLE product based on SLES 15. This product is available for the x86_64 and ARM aarch64 hardware platforms." For more info on the SLE 15 Public Beta program, visit here.

According to Phoronix, the GNOME shell memory leak has been fixed: "the changes are currently staged in Git for what will become GNOME 3.30 but might also be backported to 3.28". See GNOME developer Georges Stavracas' blog post for more details on the leak and its fix.

Jill Franklin

Heptio Announces Gimbal, Netflix Open-Sources Titus, Linux 4.15 Reaches End of Life and More

2 months 4 weeks ago
News Cloud Containers AWS Kubernetes OpenStack Gimbal kernel multimedia FFmpeg

News briefs for April 23, 2018.

Heptio this morning announced Gimbal, "an open source initiative to unify and scale the flow of network traffic into hybrid environments consisting of multiple Kubernetes clusters and traditional infrastructure technologies including OpenStack". The initiative is in collaboration with Actapio, a subsidiary of Yahoo Japan Corporation, and according to Craig McLuckie, founder and CEO of Heptio, "This collaboration demonstrates the full potential of cloud native technologies and open source as a way to not only manage applications, but address broader infrastructure considerations."

Netflix open-sources its Titus container management system. According to Christine Hall's DataCenter Knowledge article, Titus is tightly integrated with AWS, and it "launches as many as three million containers per week, to host thousands of applications over seven regionally isolated stacks across tens of thousands of EC2 virtual machines."

Linux 4.15 has reached end of life. Greg Kroah-Hartman announced on the LKML that if you're still using the 4.15 kernel series, it's time to upgrade to the 4.16.y kernel tree.

GNU Parallel 20180422 was released yesterday. GNU Parallel is "a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input". More info is available here.

FFmpeg has a new major release: version 4.0 "Wu". Some highlights include "bitstream filters for editing metadata in H.264, HEVC and MPEG-2 streams", experimental Magic YUV encoder, TiVo ty/ty+ demuxer and more.

Jill Franklin

Userspace Networking with DPDK

2 months 4 weeks ago
Rami Rosen, Mon, 04/23/2018 - 07:07

DPDK is a fully open-source project that operates in userspace. It's a multi-vendor and multi-architecture project, and it aims at achieving high I/O performance and reaching high packet processing rates, which are some of the most important features in the networking arena. It was created by Intel in 2010 and moved to the Linux Foundation in April 2017. This move positioned it as one of the most dominant and most important open-source Linux projects. DPDK was created for the telecom/datacom infrastructure, but today, it's used almost everywhere, including the cloud, data centers, appliances, containers and more. In this article, I present a high-level overview of the project and discuss features that were released in DPDK 17.08 (August 2017).

Undoubtedly, a lot of effort in many networking projects is geared toward achieving high speed and high performance. Several factors contribute to achieving this goal with DPDK. One is that DPDK is a userspace application that bypasses the heavy layers of the Linux kernel networking stack and talks directly to the network hardware. Another factor is usage of memory hugepages. By using hugepages (of 2MB or 1GB in size), a smaller number of memory pages is needed than when using standard memory pages (which on many platforms are 4KB in size). As a result, the number of Translation Lookaside Buffer (TLB) misses is reduced significantly, and performance is increased. Yet another factor is that low-level optimizations are done in the code, some of them related to memory cache line alignment, aiming at achieving optimal cache use, prefetching and so on. (Delving into the technical details of those optimizations is outside the scope of this article.)
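
As a rough illustration, hugepages typically are reserved and mounted on the host before a DPDK application starts. The page count and mount point below are only example values:

# reserve 1024 2MB hugepages on the running system
echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
# mount a hugetlbfs filesystem for DPDK to use
mkdir -p /mnt/huge
mount -t hugetlbfs nodev /mnt/huge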

DPDK has gained popularity in recent years, and it's used in many open-source projects. Many Linux distributions (Fedora, Ubuntu and others) have included DPDK support in their packaging systems as well.

The core DPDK ingredients are libraries and drivers, also known as Poll Mode Drivers (PMDs). There are more than 35 libraries at the time of this writing. These libraries abstract away the low-level implementation details, which provides flexibility as each vendor implements its own low-level layers.

The DPDK Development Model

DPDK is written mostly in C, but the project also has a few tools that are written in Python. All code contributions to DPDK are done by patches sent and discussed over the dpdk-dev mailing list. Patches aiming at getting feedback first are usually titled RFCs (Request For Comments). In order to keep the code as stable as possible, the preference is to preserve the ABI (Application Binary Interface) whenever possible. When it seems that there's no other choice, developers should follow a strict ABI deprecation process, including announcement of the requested ABI changes over the dpdk-dev mailing list ahead of time. The ABI changes that are approved and merged are documented in the Release Notes. When acceptance of new features is in doubt, but the respective patches are merged into the master tree anyway, they are tagged as "EXPERIMENTAL". This means that those patches may be changed or even could be removed without prior notice. Thus, for example, new rte_bus experimental APIs were added in DPDK 17.08. I also should note that usually whenever patches for a new, generic API (which should support multiple hardware devices from different vendors) are sent over the mailing list, it's expected that at least one hardware device that supports the new feature is available on the market (if the device is merely announced and not available, developers can't test it).

There's a technical board of nine members from various companies (Intel, NXP, 6WIND, Cavium and others). Meetings typically are held every two weeks over IRC, and the minutes of those meetings are posted on the dpdk-dev mailing list.

As with other large open-source projects, there are community-driven DPDK events across the globe on a regular basis every year. First, there are various DPDK Summits. Among them, DPDK Summit Userspace is focused on being more interactive and on getting feedback from the community. There also are several DPDK meetups in different locations around the world. Moreover, from time to time there is an online community survey, announced over the dpdk-dev mailing list, in order to get feedback from the community, and everyone can participate in it.

The DPDK website hosts the master DPDK repo, but several other repos are dedicated for new features. Several tools and utilities exist in the DPDK tree, among them are the dpdk-devbind.py script, which is for associating a network device or a crypto device with DPDK, and testpmd, which is a CLI tool for various tasks, such as forwarding, monitoring statistics and more. There are almost 50 sample applications under the "examples" folder, bundled with full detailed documentation.
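
To give a taste of testpmd, a typical interactive invocation looks roughly like the following sketch; the binary path, core list and memory-channel count are example values and depend on how you built and configured DPDK:

./build/app/testpmd -l 0-3 -n 4 -- -i
# at the testpmd> prompt you then can run commands such as:
#   show port info 0   # print link status, MAC address and driver of port 0
#   start              # start forwarding packets between ports
#   stop               # stop forwarding and print statistics
#   quit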

Apart from DPDK itself, the DPDK site hosts several other open-source projects. One is the DPDK Test Suite (DTS), which is a Python-based framework for DPDK. It has more than 100 test modules for various features, including the most advanced and most recent features. It runs with IXIA and Scapy traffic generators. It includes both functional and benchmarking tests, and it's very easy to configure and run, as you need to set up only three or four configuration files. You also can set the DPDK version with which you want to run it. DTS development is handled over a dedicated mailing list, and it currently has support for Intel NICs and Mellanox NICs.

DPDK is released every three months. This release cadence is designed to allow DPDK to keep evolving at a rapid pace while giving enough opportunity to review, discuss and improve the contributions. There are usually 3–5 release candidates (RCs) before the final release. For the 17.08 release, there were 1,023 patches from 125 authors, including patches from Intel, Cavium, 6WIND, NXP and others. The release numbers follow the Ubuntu versions convention. A Long Term Stable (LTS) release is maintained for two years. Plans for future LTS releases currently are being discussed in the DPDK community. The plan is to make every .11 release in an even-numbered year (16.11, 18.11 and so forth) an LTS release and to maintain it for two years.

Recent Features and New Ideas

Several interesting features were added last year. One of the most fascinating capabilities (added in DPDK 17.05, with new features enabled in 17.08 and 17.11) is "Dynamic Device Personalization" (DDP) for the Intel I40E driver (10Gb/25Gb/40Gb). This feature allows applying a per-device profile to the I40E firmware dynamically. You can load a profile by running a testpmd CLI command (ddp add), and you can remove it with ddp del. You also can apply or remove profiles when traffic is flowing, with a small number of packets dropped during handling a profile. These profiles are created by Intel and not by customers, as I40E firmware programming requires deep knowledge of the I40E device internals.
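
As a rough sketch of what that looks like at the testpmd prompt, the commands below use port 0 and a hypothetical profile filename; the exact argument syntax varies between DPDK versions, so consult the testpmd documentation for your release:

testpmd> ddp add 0 profile.pkgo
testpmd> ddp del 0 profile.pkgo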

Other features to mention include Bruce Richardson's build system patch, which provides a more efficient build system for DPDK with meson and ninja, a new kernel module called Kernel Control Path (KCP), port representors and more.

DPDK and Networking Projects

DPDK is used in various important networking projects. The list is quite long, but I want to mention a few of them briefly:

  • Open vSwitch (OvS): the OvS project implements a virtual network switch. It was transitioned to the Linux Foundation in August 2016 and gained a lot of popularity in the industry. DPDK was first integrated into OvS 2.2 in 2015. Later, in OvS 2.4, support for vHost user, which is a virtual device, was added. Support for advanced features like multiqueue and NUMA awareness was added in subsequent releases.
  • Contrail vRouter: Contrail Systems was a startup that developed SDN controllers. Juniper Networks acquired it in 2012, and Juniper Networks released the Contrail vRouter later as an open-source project. It uses DPDK to achieve better network performance.
  • pktgen-dpdk: an open-source traffic generator based on DPDK (hosted on the DPDK site).
  • TREX: a stateful and stateless open-source traffic generator based on DPDK.
  • Vector Packet Processing (VPP): an FD.io project.
Getting Started with DPDK

For those who are newcomers to DPDK, both users and developers, there is excellent documentation hosted on the DPDK site. It's recommended that you actually try running several of the sample applications (following the "Sample Applications User Guides"), starting with the "Hello World" application. It's also a good idea to follow the dpdk-users mailing list on a regular basis. For those who are interested in development, the Programmer's Guide is a good source of information about the architecture and development environment, and developers should follow the dpdk-dev mailing list as well.

DPDK and SR-IOV Example

I want to conclude this article with a very basic example (based on SR-IOV) of how to create a DPDK VF and how to attach it to a VM with qemu. I also show how to create a non-DPDK VF ("kernel VF"), attach it to a VM, run a DPDK app on that VF and communicate with it from the host.

As a preparation step, you need to enable IOMMU and virtualization on the host. To support this, add intel_iommu=on iommu=pt as kernel parameters to the kernel command line (in grub.cfg), and also enable virtualization and VT-d in the BIOS (VT-d stands for "Intel Virtualization Technology for Directed I/O"). You'll use the Intel I40E network interface card for this example. The I40E device driver supports up to 128 VFs per device, divided equally across ports, so if you have a quad-port I40E NIC, you can create up to 32 VFs on each port.
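
On many distributions, a persistent way to add those kernel parameters is to edit the grub defaults and regenerate the configuration; the file paths and the regeneration command below vary by distribution, so treat this as a sketch:

# /etc/default/grub (excerpt)
GRUB_CMDLINE_LINUX="... intel_iommu=on iommu=pt"

# regenerate the grub configuration, for example on Fedora/RHEL:
grub2-mkconfig -o /boot/grub2/grub.cfg
# or on Debian/Ubuntu:
update-grub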

For this example, I also show a simple usage of the testpmd CLI, as mentioned earlier. This example is based on DPDK-17.08, the most recent release of DPDK at the time of this writing. In this example, you'll use Single Root I/O Virtualization (SR-IOV), which is an extension of the PCI Express (PCIe) specification and allows sharing a single physical PCI Express resource across several virtual environments. This technology is very popular in data-center/cloud environments, and many network adapters and their drivers support this feature. I should note that SR-IOV is not limited to network devices, but is available for other PCI devices as well, such as graphic cards.

DPDK VF

You create DPDK VFs by writing the number of requested VFs into a DPDK sysfs entry called max_vfs. Say that eth8 is the PF on top of which you want to create a VF and its PCI address is 0000:07:00.0. (You can fetch the PCI address with ethtool -i eth8 | grep bus-info.) The following is the sequence you run on the host in order to create a VF and launch a VM. First, bind the PF to DPDK with usertools/dpdk-devbind.py, for example:

modprobe uio
insmod /build/kmod/igb_uio.ko
./usertools/dpdk-devbind.py -b igb_uio 0000:07:00.0

Then, create two DPDK VFs with:

echo 2 > /sys/bus/pci/devices/0000:07:00.0/max_vfs

You can verify that the two VFs were created by this operation by checking whether two new entries were added when running: lspci | grep "Virtual Function", or by verifying that you have now two new symlinks under /sys/bus/pci/devices/0000:07:00.0/ for the two newly created VFs: virtfn0 and virtfn1.
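
For example, those two checks could look like this on the host (using the PCI address from this example):

lspci | grep "Virtual Function"
ls -l /sys/bus/pci/devices/0000:07:00.0/ | grep virtfn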

Next, launch the VMs via qemu using PCI Passthrough, for example:

qemu-system-x86_64 -enable-kvm -cpu host \
  -drive file=Ubuntu_1604.qcow2,index=0,media=disk,format=qcow2 \
  -smp 5 -m 2048 -vga qxl \
  -vnc :1 \
  -device pci-assign,host=0000:07:02.0 \
  -net nic,macaddr=00:00:00:99:99:01 \
  -net tap,script=/etc/qemu-ifup

Note: qemu-ifup is a shell script that's invoked when the VM is launched, usually for setting up networking.

Next, you can start a VNC client (such as RealVNC client) to access the VM, and from there, you can verify that the VF was indeed assigned to it, with lspci -n. You should see a single device, which has "8086 154c" as the vendor ID/device ID combination; "8086 154c" is the virtual function PCI ID of the I40E NIC. You can launch a DPDK application in the guest on top of that VF.

Kernel VF

To conclude this example, let's create a kernel VF on the host and run a DPDK app on top of it in the VM, and then let's look at a simple interaction with the host PF.

First, create two kernel VFs with:

echo 2 > /sys/bus/pci/devices/0000:07:00.0/sriov_numvfs

Here again you can verify that these two VFs were created by running lspci | grep "Virtual Function".

Next, run this sequence:

echo "8086 154c" > /sys/bus/pci/drivers/pci-stub/new_id echo 07:02.0 > /sys/bus/pci/devices/$VF_PCI_0/driver/unbind echo 07:02.0 > /sys/bus/pci/drivers/pci-stub/bind

Then launch the VM the same way as before, with the same qemu-system-x86_64 command mentioned earlier. Again, in the guest, you should be able to see the I40E VF with lspci -n. On the host, doing ip link show will show the two VFs of eth8: vf 0 and vf 1. You can set the MAC addresses of a VF from the host with ip link set—for example:

ip link set eth8 vf 0 mac 00:11:22:33:44:55

Then, when you run a DPDK application like testpmd in the guest, and run, for example, show port info 0 from the testpmd CLI, you'll see that indeed the MAC address that you set in the host is reflected for that VF in DPDK.

Summary

This article provides a high-level overview of the DPDK project, which is growing dynamically and gaining popularity in the industry. The near future likely will bring support for more network interfaces from different vendors, as well as new features.

Rami Rosen

Weekend Reading: Networking

3 months ago
Carlie Fairchild, Sat, 04/21/2018 - 08:08

Networking is one of Linux's strengths and a popular topic for our subscribers. For your weekend reading, we've curated some of Linux Journal's most popular networking articles.

 

NTPsec: a Secure, Hardened NTP Implementation

by Eric S. Raymond

Network time synchronization—aligning your computer's clock to the same Universal Coordinated Time (UTC) that everyone else is using—is both necessary and a hard problem. Many internet protocols rely on being able to exchange UTC timestamps accurate to small tolerances, but the clock crystal in your computer drifts (its frequency varies by temperature), so it needs occasional adjustments.

 

smbclient Security for Windows Printing and File Transfer

by Charles Fisher

Microsoft Windows is usually a presence in most computing environments, and UNIX administrators likely will be forced to use resources in Windows networks from time to time. Although many are familiar with the Samba server software, the matching smbclient utility often escapes notice.

 

Understanding Firewalld in Multi-Zone Configurations

by Nathan R. Vance and William F. Polik

Stories of compromised servers and data theft fill today's news. It isn't difficult for someone who has read an informative blog post to access a system via a misconfigured service, take advantage of a recently exposed vulnerability or gain control using a stolen password. Any of the many internet services found on a typical Linux server could harbor a vulnerability that grants unauthorized access to the system.

 

Papa's Got a Brand New NAS

by Kyle Rankin

It used to be that the true sign you were dealing with a Linux geek was the pile of computers lying around that person's house. How else could you experiment with networked servers without a mass of computers and networking equipment? If you work as a sysadmin for a large company, sometimes one of the job perks is that you get first dibs on decommissioned equipment. Through the years, I was able to amass quite a home network by combining some things I bought myself with some equipment that was too old for production. A major point of pride in my own home network was the 24U server cabinet in the garage. It had a gigabit top-of-rack managed switch, a 2U UPS at the bottom, and in the middle was a 1U HP DL-series server with a 1U eSATA disk array attached to it. Above that was a slide-out LCD and keyboard in case I ever needed to work on the server directly.

 

Banana Backups

by Kyle Rankin

I wrote an article called "Papa's Got a Brand New NAS" where I described how I replaced my rackmounted gear with a small, low-powered ARM device—the Odroid XU4. Before I settled on that solution, I tried out a few others including a pair of Banana Pi computers—small single-board computers like Raspberry Pis only with gigabit networking and SATA2 controllers on board. In the end, I decided to go with a single higher-powered board and use a USB3 disk enclosure with RAID instead of building a cluster of Banana Pis that each had a single disk attached. Since I had two Banana Pis left over after this experiment, I decided to put them to use, so in this article, I describe how I turned one into a nice little backup server.

 

Roll Your Own Enterprise Wi-Fi

by Shawn Powers

The UniFi line of products from Ubiquiti is affordable and reliable, but the really awesome feature is its (free!) Web-based controller app. The only UniFi products I have are wireless access points, even though the company also has added switches, gateways and even VoIP products to the mix. Even with my limited selection of products, however, the Web controller makes designing and maintaining a wireless network not just easy, but fun!

 

Tracking Down Blips

by Shawn Powers

In a previous article, I explained the process for setting up Cacti, which is a great program for graphing just about anything. One of the main things I graph is my internet usage. And, it's great information to have, until there is internet activity you can't explain. In my case, there was a "blip" every 20 minutes or so that would use about 4Mbps of bandwidth (Figure 1). In the grand scheme of things, it wasn't a big deal, because my connection is 60Mbps down. Still, it was driving me crazy. I don't like the idea of something on my network doing things on the internet without my knowledge. So, the hunt began.

 

Carlie Fairchild

Caption This!

3 months ago
Carlie Fairchild, Fri, 04/20/2018 - 11:21

Each month, we provide a cartoon in need of a caption. You submit your caption, we choose three finalists, and readers vote for their favorite. The winning caption for this month's cartoon will appear in the June issue of Linux Journal.

To enter, simply type in your caption in the comments below or email us, publisher@linuxjournal.com.

Carlie Fairchild

Mozilla's Common Voice Project, Red Hat Announces Vault Operator, VirtualBox 5.2.10 Released and More

3 months ago
Mozilla open source Apple Red Hat News VirtualBox Chrome OS Cloud Security Containers

News briefs April 20, 2018.

Participate in Mozilla's open-source Common Voice Project, an initiative to help teach machines how real people speak: "Now you can donate your voice to help us build an open-source voice database that anyone can use to make innovative apps for devices and the web." For more about the Common Voice Project, see the story on opensource.com.

Red Hat yesterday announced the Vault Operator, a new open-source project that "aims to make it easier to install, manage, and maintain instances of Vault—a tool designed for storing, managing, and controlling access to secrets, such as tokens, passwords, certificates, and API keys—on Kubernetes clusters."

Google might be working on implementing dual-boot functionality in Chrome OS to allow Chromebook users to boot multiple OSes. Softpedia News reports on a Reddit thread that references "Alt OS" in recent Chromium Gerrit commits. This is only speculation so far, and Google has not confirmed it is working on dual-boot support for Chrome OS on Chromebooks.

Oracle recently released VirtualBox 5.2.10. This release addresses the CPU (Critical Patch Updates) Advisory for April 2018 related to Oracle VM VirtualBox and includes several other improvements, such as a fix for a KDE Plasma hang and support for multiple NVMe controllers with ICH9 enabled. See the Changelog for all the details.

Apple yesterday announced it has open-sourced its FoundationDB cloud database. Apple's goal is "to build a community around the project and make FoundationDB the foundation for the next generation of distributed databases". The project is now available on GitHub.

Jill Franklin

More L337 Translations

3 months ago
Dave Taylor, Thu, 04/19/2018 - 09:20

Dave continues with his shell-script L33t translator.

In my last article, I talked about the inside jargon of hackers and computer geeks known as "Leet Speak" or just "Leet". Of course, that's a shortened version of the word Elite, and it's best written as L33T or perhaps L337 to be ultimately kewl. But hey, I don't judge.

Last time I looked at a series of simple letter substitutions that allow you to convert a sentence like "I am a master hacker with great skills" into something like this:

I AM A M@ST3R H@XR WITH GR3@T SKILLZ

It turns out that I missed some nuances of Leet and didn't realize that most often the letter "a" is actually turned into a "4", not an "@", although as with just about everything about the jargon, it's somewhat random.

In fact, every single letter of the alphabet can be randomly tweaked and changed, sometimes from a single letter to a sequence of two or three symbols. For example, another variation on "a" is "/-\" (for what are hopefully visually obvious reasons).

Continuing in that vein, "B" can become "|3", "C" can become "[", "I" can become "1", and one of my favorites, "M" can change into "[]V[]". That's a lot of work, but since one of the goals is to have a language no one else understands, I get it.

There are additional substitutions: a word can have its trailing "S" replaced by a "Z", a trailing "ED" can become "'D" or just "D", and another interesting one is that words containing "and", "anned" or "ant" can have that sequence replaced by an ampersand (&).

Let's add all these L337 filters and see how the script is shaping up.

But First, Some Randomness

Since many of these transformations are going to have a random element, let's go ahead and produce a random number between 0 and 9 to figure out whether to do one or another action. That's easily done with the $RANDOM variable:

doit=$(( $RANDOM % 10 )) # random virtual coin flip

Now let's say that there's a 50% chance that a -ed suffix is going to change to "'D" and a 50% chance that it's just going to become "D", which is coded like this:

if [ $doit -ge 5 ] ; then
  word="$(echo $word | sed "s/ed$/d/")"
else
  word="$(echo $word | sed "s/ed$/'d/")"
fi

Let's add the additional transformations, but not do them every time. Let's give them a 70–90% chance of occurring, based on the transform itself. Here are a few examples:

if [ $doit -ge 3 ] ; then
  word="$(echo $word | sed "s/cks/x/g;s/cke/x/g")"
fi
if [ $doit -ge 4 ] ; then
  word="$(echo $word | sed "s/and/\&/g;s/anned/\&/g;s/ant/\&/g")"
fi

And so, here's the second translation, a bit more sophisticated:

$ l33t.sh "banned? whatever. elite hacker, not scriptie." B&? WH4T3V3R. 3LIT3 H4XR, N0T SCRIPTI3.

Note that it hasn't realized that "elite" should become L337 or L33T, but since it is supposed to be rather random, let's just leave this script as is. Kk? Kewl.

If you want to expand it, an interesting programming problem is to break each word down into individual letters, then randomly change lowercase to uppercase or vice versa, so you get those great ransom-note-style WeiRD LeTtEr pHrASes.
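
Here's one minimal sketch of that idea using bash 4's case-modification expansions, with $RANDOM deciding the case of each letter; it's a starting point rather than part of the script above:

word="great skills"
result=""
for (( i=0; i<${#word}; i++ )); do
  ch="${word:$i:1}"           # take one character at a time
  if [ $(( RANDOM % 2 )) -eq 0 ]; then
    result+="${ch^^}"         # flip it to uppercase
  else
    result+="${ch,,}"         # flip it to lowercase
  fi
done
echo "$result"                # e.g. gReAT sKiLlS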

Next time, I plan to move on, however, and look at the great command-line tool youtube-dl, exploring how to use it to download videos and even just the audio tracks as MP3 files.

Dave Taylor

Help Canonical Test GNOME Patches, Android Apps Illegally Tracking Kids, MySQL 8.0 Released and More

3 months ago
GNOME Desktop News Android Security Privacy MySQL KDE LibreOffice Cloud

News briefs for April 19, 2018.

Help Canonical test the GNOME desktop memory leak fixes in Ubuntu 18.04 LTS (Bionic Beaver) by downloading and installing the current daily ISO for your hardware from here: http://cdimage.ubuntu.com/daily-live/current/bionic-desktop-amd64.iso. Then download the patched version of gjs, install, reboot, and then just use your desktop normally. If performance seems impacted by the new packages, re-install from the ISO again, but don't install the new packages and see if things are better. See the Ubuntu Community page for more detailed instructions.

Thousands of Android apps downloaded from the Google Play store may be tracking kids' data illegally, according to a new study. NBC News reports: "Researchers at the University of California's International Computer Science Institute analyzed 5,855 of the most downloaded kids apps, concluding that most of them 'are potentially in violation' of the Children's Online Privacy Protection Act 1998, or COPPA, a federal law making it illegal to collect personally identifiable data on children under 13."

MySQL 8.0 has been released. This new version "includes significant performance, security and developer productivity improvements enabling the next generation of web, mobile, embedded and Cloud applications." MySQL 8.0 features include MySQL document store, transactional data dictionary, SQL roles, default to utf8mb4 and more. See the white paper for all the details.

KDE announced this morning that KDE Applications 18.04.0 are now available. New features include improvements to panels in the Dolphin file manager; Wayland support for KDE's JuK music player; improvements to Gwenview, KDE's image viewer and organizer; and more.

Collabora Productivity, "the driving force behind putting LibreOffice in the cloud", announced a new release of its enterprise-ready cloud document suite—Collabora Online 3.2. The new release includes implemented chart creation, data validation in Calc, context menu spell-checking and more.

Jill Franklin

An Update on Linux Journal

3 months ago
Carlie Fairchild, Wed, 04/18/2018 - 12:41

So many of you have asked how to help Linux Journal continue to be published* for years to come.

First, keep the great ideas coming—we all want to continue making Linux Journal 2.0 something special, and we need this community to do it.

Second, subscribe or renew. Magazines have a built-in fundraising program: subscriptions. It's true that most magazines don't survive on subscription revenue alone, but having a strong subscriber base tells Linux Journal, prospective authors, and yes, advertisers, that there is a community of people who support and read the magazine each month.

Third, if you prefer reading articles on our website, consider becoming a Patron. We have different Patreon reward levels, one of which even gets your name immortalized in the pages of Linux Journal.

Fourth, spread the word within your company about corporate sponsorship of Linux Journal. We as a community reject tracking, but we explicitly invite high-value advertising that sponsors the magazine and values readers. This is new and unique in online publishing, and just one example of our pioneering work here at Linux Journal.  

Finally, write for us! We are always looking for new writers, especially now that we are publishing more articles more often.  

With all our gratitude,

Your friends at Linux Journal

 

*We'd be remiss not to acknowledge and thank Private Internet Access for saving the day and bringing Linux Journal back from the dead. They are incredibly supportive partners, and sincerely, we cannot thank them enough for keeping us going. At a certain point, however, Linux Journal has to become sustainable on its own.

Carlie Fairchild

Rise of the Tomb Raider Comes to Linux Tomorrow, IoT Developers Survey, New Zulip Release and More

3 months ago
News gaming Chrome GIMP IOT openSUSE Distributions Desktop

News briefs for April 18, 2018.

Rise of the Tomb Raider: 20 Year Celebration comes to Linux tomorrow! A minisite dedicated to Rise of the Tomb Raider is available now from Feral Interactive, and you also can view the trailer on Feral's YouTube channel.

Zulip, the open-source team chat software, has announced the release of Zulip Server 1.8. This is a huge release, with more than 3,500 new commits since the last release in October 2017. Zulip "is an alternative to Slack, HipChat, and IRC. Zulip combines the immediacy of chat with the asynchronous efficiency of email-style threading, and is 100% free and open-source software".

The IoT Developers Survey 2018 is now available. The survey was sponsored by the Eclipse IoT Working Group, Agile IoT, IEEE and the Open Mobile Alliance "to better understand how developers are building IoT solutions". The survey covers what people are building, key IoT concerns, top IoT programming languages and distros, and more.

Google released Chrome 66 to its stable channel for desktop/mobile users. This release includes many security improvements as well as new JavaScript APIs. See the Chrome Platform Status site for details.

openSUSE Leap 15 is scheduled for release May 25, 2018. Leap 15 "shares a common core with SUSE Linux Enterprise (SLE) 15 sources and has thousands of community packages on top to meet the needs of professional and semi-professional users and their workloads."

GIMP 2.10.0 RC 2 has been released. This release fixes 44 bugs and introduces important performance improvements. See the complete list of changes here.

Jill Franklin

Create Dynamic Wallpaper with a Bash Script

3 months ago
Patrick Wheelan, Wed, 04/18/2018 - 09:58

Harness the power of bash and learn how to scrape websites for exciting new images every morning.

So, you want a cool dynamic desktop wallpaper without dodgy programs and a million viruses? The good news is, this is Linux, and anything is possible. I started this project because I was bored of my standard OS desktop wallpaper, and I have slowly created a plethora of scripts to pull images from several sites and set them as my desktop background. It's a nice little addition to my day—being greeted by a different cat picture or a panorama of a country I didn't know existed. The great news is that it's easy to do, so let's get started.

Why Bash?

Bash (the Bourne Again SHell) is standard across almost all *NIX systems and provides a wide range of operations "out of the box", which would take time and copious lines of code to achieve in a conventional programming or even scripting language. Additionally, there's no need to re-invent the wheel. It's much easier to use somebody else's program to download web pages, for example, than to deal with low-level system sockets in C.

How's It Going to Work?

The concept is simple. Choose a site with images you like and "scrape" the page for those images. Then once you have a direct link, you download them and set them as the desktop wallpaper using the display manager. Easy right?

A Simple Example: xkcd

To start off, let's venture to every programmer's second-favorite page after Stack Overflow: xkcd. Loading the page, you should be greeted by the daily comic strip and some other data.

Now, what if you want to see this comic without venturing to the xkcd site? You need a script to do it for you. First, you need to know how the webpage looks to the computer, so download it and take a look. To do this, use wget, an easy-to-use, commonly installed, non-interactive, network downloader. So, on the command line, call wget, and give it the link to the page:

user@LJ $: wget https://www.xkcd.com/
--2018-01-27 21:01:39-- https://www.xkcd.com/
Resolving www.xkcd.com... 151.101.0.67, 151.101.192.67, 151.101.64.67, ...
Connecting to www.xkcd.com|151.101.0.67|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2606 (2.5K) [text/html]
Saving to: 'index.html'

index.html    100%[==========================================================>]   2.54K  --.-KB/s    in 0s

2018-01-27 21:01:39 (23.1 MB/s) - 'index.html' saved [6237]

As you can see in the output, the page has been saved to index.html in your current directory. Using your favourite editor, open it and take a look (I'm using nano for this example):

user@LJ $: nano index.html

Now you might realize, despite this being a rather bare page, there's a lot of code in that file. Instead of going through it all, let's use grep, which is perfect for this task. Its sole function is to print lines matching your search. Grep uses the syntax:

user@LJ $: grep [search] [file]

Looking at the daily comic, its current title is "Night Sky". Searching for "night" with grep yields the following results:

user@LJ $: grep "night" index.html Image URL (for hotlinking/embedding): ↪https://imgs.xkcd.com/comics/night_sky.png

The grep search has returned two image links in the file, each related to "night". Looking at those two lines, one is the image in the page, and the other is for hotlinking and is already a usable link. You'll be obtaining the first link, however, as it is more representative of other pages that don't provide an easy link, and it serves as a good introduction to the use of grep and cut.

To get the first link out of the page, you first need to identify it in the file programmatically. Let's try grep again, but this time instead of using a string you already know ("night"), let's approach it as if you know nothing about the page. Although the link will be different, the HTML should remain the same; therefore, img src= always should appear before the link you want:

user@LJ $: grep "img src=" index.html

It looks like there are three images on the page. Comparing these results with those from the first grep, you'll see that one of them is the comic image you already found. The other two links contain "/s/", whereas the link you want contains "/comics/". So, you need to grep the output of the last command for "/comics/". To pass along the output of the last command, use the pipe character (|):

user@LJ $: grep "img src=" index.html | grep "/comics/"

And, there's the line! Now you just need to separate the image link from the rest of it with the cut command. cut uses the syntax:

user@LJ $: cut [-d delimiter] [-f field] [-c characters]

To cut the link from the rest of the line, you'll want to cut next to the quotation mark and select the field before the next quotation mark. In other words, you want the text between the quotes, or the link, which is done like this:

user@LJ $: grep "img src=" index.html | grep "/comics/" | ↪cut -d\" -f2 //imgs.xkcd.com/comics/night_sky.png

And, you've got the link. But wait! What about those pesky forward slashes at the beginning? You can cut those out too:

user@LJ $: grep "img src=" index.html | grep "/comics/" | ↪cut -d\" -f 2 | cut -c 3- imgs.xkcd.com/comics/night_sky.png

Now you've just cut the first three characters from the line, and you're left with a link straight to the image. Using wget again, you can download the image:

user@LJ $: wget imgs.xkcd.com/comics/night_sky.png
--2018-01-27 21:42:33-- http://imgs.xkcd.com/comics/night_sky.png
Resolving imgs.xkcd.com... 151.101.16.67, 2a04:4e42:4::67
Connecting to imgs.xkcd.com|151.101.16.67|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 54636 (53K) [image/png]
Saving to: 'night_sky.png'

night_sky.png    100%[===========================================================>]  53.36K  --.-KB/s    in 0.04s

2018-01-27 21:42:33 (1.24 MB/s) - 'night_sky.png' saved [54636/54636]

Now you have the image in your directory, but its name will change when the comic's name changes. To fix that, tell wget to save it with a specific name:

user@LJ $: wget "$(grep "img src=" index.html | grep "/comics/" ↪| cut -d\" -f2 | cut -c 3-)" -O wallpaper --2018-01-27 21:45:08-- http://imgs.xkcd.com/comics/night_sky.png Resolving imgs.xkcd.com... 151.101.16.67, 2a04:4e42:4::67 Connecting to imgs.xkcd.com|151.101.16.67|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 54636 (53K) [image/png] Saving to: 'wallpaper' wallpaper 100% [==========================================================>] 53.36K --.-KB/s in 0.04s 2018-01-27 21:45:08 (1.41 MB/s) - 'wallpaper' saved [54636/54636]

The -O option means that the downloaded image now has been saved as "wallpaper". Now that you know the name of the image, you can set it as a wallpaper. This varies depending upon which display manager you're using. The most popular are listed below, assuming the image is located at /home/user/wallpaper.

GNOME:

gsettings set org.gnome.desktop.background picture-uri "file:///home/user/wallpaper"
gsettings set org.gnome.desktop.background picture-options scaled

Cinnamon:

gsettings set org.cinnamon.desktop.background picture-uri "file:///home/user/wallpaper"
gsettings set org.cinnamon.desktop.background picture-options scaled

Xfce:

xfconf-query --channel xfce4-desktop --property /backdrop/screen0/monitor0/image-path --set /home/user/wallpaper
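
If you want one script that works across desktops, a rough sketch is to branch on the $XDG_CURRENT_DESKTOP environment variable; the values matched below are typical but not guaranteed on every distribution:

case "$XDG_CURRENT_DESKTOP" in
  *GNOME*)
    gsettings set org.gnome.desktop.background picture-uri "file:///home/user/wallpaper" ;;
  *X-Cinnamon*)
    gsettings set org.cinnamon.desktop.background picture-uri "file:///home/user/wallpaper" ;;
  *XFCE*)
    xfconf-query --channel xfce4-desktop --property /backdrop/screen0/monitor0/image-path --set /home/user/wallpaper ;;
esac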

You can set your wallpaper now, but you need different images to mix in. Looking at the webpage, there's a "random" button that takes you to a random comic. Searching with grep for "random" returns the following:

user@LJ $: grep random index.html
<a href="//c.xkcd.com/random/comic/">Random</a>
<a href="//c.xkcd.com/random/comic/">Random</a>

This is the link to a random comic, and downloading it with wget and reading the result, it looks like the initial comic page. Success!

    Now that you've got all the components, let's put them together into a script, replacing www.xkcd.com with the new c.xkcd.com/random/comic/:

#!/bin/bash
wget c.xkcd.com/random/comic/
wget "$(grep "img src=" index.html | grep /comics/ | cut -d\" -f 2 | cut -c 3-)" -O wallpaper
gsettings set org.gnome.desktop.background picture-uri "file:///home/user/wallpaper"
gsettings set org.gnome.desktop.background picture-options scaled

    All of this should be familiar except the first line, which designates this as a bash script, and the second wget command. To capture the output of commands into a variable, you use $(). In this case, you're capturing the grepping and cutting process—capturing the final link and then downloading it with wget. When the script is run, the commands inside the bracket are all run producing the image link before wget is called to download it.

    There you have it—a simple example of a dynamic wallpaper that you can run anytime you want.

    If you want the script to run automatically, you can add a cron job to have cron run it for you. So, edit your cron tab with:

    user@LJ $: crontab -e

    My script is called "xkcd", and my crontab entry looks like this:

    @reboot /bin/bash /home/user/xkcd

This will run the script (located at /home/user/xkcd) with bash at every restart.
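
If you'd rather refresh the wallpaper on a schedule instead of at boot, a crontab line like this (9am daily, as an example) works too:

0 9 * * * /bin/bash /home/user/xkcd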

    Reddit

    The script above shows how to search for images in HTML code and download them. But, you can apply this to any website of your choice—although the HTML code will be different, the underlying concepts remain the same. With that in mind, let's tackle downloading images from Reddit. Why Reddit? Reddit is possibly the largest blog on the internet and the third-most-popular site in the US. It aggregates content from many different communities together onto one site. It does this through use of "subreddits", communities that join together to form Reddit. For the purposes of this article, let's focus on subreddits (or "subs" for short) that primarily deal with images. However, any subreddit, as long as it allows images, can be used in this script.

    Figure 1. Scraping the Web Made Simple—Analysing Web Pages in a Terminal

    Diving In

    Just like the xkcd script, you need to download the web page from a subreddit to analyse it. I'm using reddit.com/r/wallpapers for this example. First, check for images in the HTML:

user@LJ $: wget https://www.reddit.com/r/wallpapers/ && grep "img src=" index.html
--2018-01-28 20:13:39-- https://www.reddit.com/r/wallpapers/
Resolving www.reddit.com... 151.101.17.140
Connecting to www.reddit.com|151.101.17.140|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 27324 (27K) [text/html]
Saving to: 'index.html'

index.html    100%[==========================================================>]  26.68K  --.-KB/s    in 0.1s

2018-01-28 20:13:40 (270 KB/s) - 'index.html' saved [169355]

a community for 9 years ....Forever and ever...... --- SNIP ---

    All the images have been returned in one long line, because the HTML for the images is also in one long line. You need to split this one long line into the separate image links. Enter Regex.

    Regex is short for regular expression, a system used by many programs to allow users to match an expression to a string. It contains wild cards, which are special characters that match certain characters. For example, the * character will match every character. For this example, you want an expression that matches every link in the HTML file. All HTML links have one string in common. They all take the form href="LINK". Let's write a regex expression to match:

    href="([^"#]+)"

    Now let's break it down:

    • href=" — simply states that the first characters should match these.

    • () — forms a capture group.

    • [^] — forms a negated set. The string shouldn't match any of the characters inside.

    • + — the string should match one or more of the preceding tokens.

    Altogether the regex matches a string that begins href=", doesn't contain any quotation marks or hashtags and finishes with a quotation mark.

    This regex can be used with grep like this:

user@LJ $: grep -o -E 'href="([^"#]+)"' index.html
href="/static/opensearch.xml"
href="https://www.reddit.com/r/wallpapers/"
href="//out.reddit.com"
href="//out.reddit.com"
href="//www.redditstatic.com/desktop2x/img/favicon/apple-icon-57x57.png"
--- SNIP ---

The -E option enables extended regex support, and the -o switch means grep prints only the part of each line that matches the pattern rather than the whole line. You now have a much more manageable list of links. From there, you can use the same techniques from the first script to extract the links and filter for images. This looks like the following:

user@LJ $: grep -o -E 'href="([^"#]+)"' index.html | cut -d'"' -f2 | sort | uniq | grep -E '.jpg|.png'
https://i.imgur.com/6DO2uqT.png
https://i.imgur.com/Ualn765.png
https://i.imgur.com/UO5ck0M.jpg
https://i.redd.it/s8ngtz6xtnc01.jpg
//www.redditstatic.com/desktop2x/img/favicon/android-icon-192x192.png
//www.redditstatic.com/desktop2x/img/favicon/apple-icon-114x114.png
//www.redditstatic.com/desktop2x/img/favicon/apple-icon-120x120.png
//www.redditstatic.com/desktop2x/img/favicon/apple-icon-144x144.png
//www.redditstatic.com/desktop2x/img/favicon/apple-icon-152x152.png
//www.redditstatic.com/desktop2x/img/favicon/apple-icon-180x180.png
//www.redditstatic.com/desktop2x/img/favicon/apple-icon-57x57.png
//www.redditstatic.com/desktop2x/img/favicon/apple-icon-60x60.png
//www.redditstatic.com/desktop2x/img/favicon/apple-icon-72x72.png
//www.redditstatic.com/desktop2x/img/favicon/apple-icon-76x76.png
//www.redditstatic.com/desktop2x/img/favicon/favicon-16x16.png
//www.redditstatic.com/desktop2x/img/favicon/favicon-32x32.png
//www.redditstatic.com/desktop2x/img/favicon/favicon-96x96.png

    The final grep uses regex again to match .jpg or .png. The | character acts as a boolean OR operator.

    As you can see, there are four matches for actual images: two .jpgs and two .pngs. The others are Reddit default images, like the logo. Once you remove those images, you'll have a final list of images to set as a wallpaper. The easiest way to remove these images from the list is with sed:

user@LJ $: grep -o -E 'href="([^"#]+)"' index.html | cut -d'"' -f2 | sort | uniq | grep -E '.jpg|.png' | sed /redditstatic/d
https://i.imgur.com/6DO2uqT.png
https://i.imgur.com/Ualn765.png
https://i.imgur.com/UO5ck0M.jpg
https://i.redd.it/s8ngtz6xtnc01.jpg

    sed works by matching what's between the two forward slashes. The d on the end tells sed to delete the lines that match the pattern, leaving the image links.

    The great thing about sourcing images from Reddit is that every subreddit contains nearly identical HTML; therefore, this small script will work on any subreddit.

    Creating a Script

    To create a script for Reddit, it should be possible to choose from which subreddits you'd like to source images. I've created a directory for my script and placed a file called "links" in the directory with it. This file contains the subreddit links in the following format:

https://www.reddit.com/r/wallpapers
https://www.reddit.com/r/wallpaper
https://www.reddit.com/r/NationalPark
https://www.reddit.com/r/tiltshift
https://www.reddit.com/r/pic

    At run time, I have the script read the list and download these subreddits before stripping images from them.

    Since you can have only one image at a time as desktop wallpaper, you'll want to narrow down the selection of images to just one. First, however, it's best to have a wide range of images without using a lot of bandwidth. So you'll want to download the web pages for multiple subreddits and strip the image links but not download the images themselves. Then you'll use a random selector to select one image link and download that one to use as a wallpaper.

    Finally, if you're downloading lots of subreddit's web pages, the script will become very slow. This is because the script waits for each command to complete before proceeding. To circumvent this, you can fork a command by appending an ampersand (&) character. This creates a new process for the command, "forking" it from the main process (the script).
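
In its simplest form, forking and then waiting looks like this (download_page here is a hypothetical stand-in for whatever command you're running):

download_page "$url1" &   # runs in the background
download_page "$url2" &   # runs concurrently with the first
wait                      # block until both background jobs finish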

    Here's my fully annotated script:

    #!/bin/bash DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" ↪# Get the script's current directory linksFile="links" mkdir $DIR/downloads cd $DIR/downloads # Strip the image links from the html function parse { grep -o -E 'href="([^"#]+)"' $1 | cut -d'"' -f2 | sort | uniq ↪| grep -E '.jpg|.png' >> temp grep -o -E 'href="([^"#]+)"' $2 | cut -d'"' -f2 | sort | uniq ↪| grep -E '.jpg|.png' >> temp grep -o -E 'href="([^"#]+)"' $3 | cut -d'"' -f2 | sort | uniq ↪| grep -E '.jpg|.png' >> temp grep -o -E 'href="([^"#]+)"' $4 | cut -d'"' -f2 | sort | uniq ↪| grep -E '.jpg|.png' >> temp } # Download the subreddit's webpages function download { rname=$( echo $1 | cut -d / -f 5 ) tname=$(echo t.$rname) rrname=$(echo r.$rname) cname=$(echo c.$rname) wget --load-cookies=../cookies.txt -O $rname $1 ↪&>/dev/null & wget --load-cookies=../cookies.txt -O $tname $1/top ↪&>/dev/null & wget --load-cookies=../cookies.txt -O $rrname $1/rising ↪&>/dev/null & wget --load-cookies=../cookies.txt -O $cname $1/controversial ↪&>/dev/null & wait # wait for all forked wget processes to return parse $rname $tname $rrname $cname } # For each line in links file while read l; do if [[ $l != *"#"* ]]; then # if line doesn't contain a ↪hashtag (comment) download $l& fi done < ../$linksFile wait # wait for all forked processes to return sed -i '/www.redditstatic.com/d' temp # remove reddit pics that ↪exist on most pages from the list wallpaper=$(shuf -n 1 temp) # select randomly from file and DL echo $wallpaper >> $DIR/log # save image into log in case ↪we want it later wget -b $wallpaper -O $DIR/wallpaperpic 1>/dev/null # Download ↪wallpaper image gsettings set org.gnome.desktop.background picture-uri ↪file://$DIR/wallpaperpic # Set wallpaper (Gnome only!) rm -r $DIR/downloads # cleanup

    Just like before, you can set up a cron job to run the script for you at every reboot or whatever interval you like.

    And, there you have it—a fully functional cat-image harvester. May your morning logins be greeted with many furry faces. Now go forth and discover new subreddits to gawk at and new websites to scrape for cool wallpapers.

    Patrick Wheelan

    Cooking With Linux (without a net): A CMS Smorgasbord

    3 months ago

    Please support Linux Journal by subscribing or becoming a patron.

Note: You are watching a recording of a live show. It's Tuesday and that means it's time for Cooking With Linux (without a net), sponsored and supported by Linux Journal. Today, I'm going to install four popular content management systems. These will be Drupal, Joomla, WordPress, and Backdrop. If you're trying to decide on what your next CMS platform should be, this would be a great time to tune in. And yes, I'll do it all live, without a net, and with a high probability of falling flat on my face. Join me today, at 12 noon, Eastern Time. Be part of the conversation.

    Content management systems covered include:

    Drupal WordPress CMS Web Development
    Marcel Gagné

    The Agony and the Ecstasy of Cloud Billing

    3 months ago
Corey Quinn, Tue, 04/17/2018 - 09:40

    Cloud billing is inherently complex; it's not just you.

    Back in the mists of antiquity when I started reading Linux Journal, figuring out what an infrastructure was going to cost was (although still obnoxious in some ways) straightforward. You'd sign leases with colocation providers, buy hardware that you'd depreciate on a schedule and strike a deal in blood with a bandwidth provider, and you were more or less set until something significant happened to your scale.

    In today's brave new cloud world, all of that goes out the window. The public cloud providers give with one hand ("Have a full copy of any environment you want, paid by the hour!"), while taking with the other ("A single Linux instance will cost you $X per hour, $Y per GB transferred per month, and $Z for the attached storage; we simplify this pricing into what we like to call 'We Make It Up As We Go Along'").

    In my day job, I'm a consultant who focuses purely on analyzing and reducing the Amazon Web Services (AWS) bill. As a result, I've seen a lot of environments doing different things: cloud-native shops spinning things up without governance, large enterprises transitioning into the public cloud with legacy applications that don't exactly support that model without some serious tweaking, and cloud migration projects that somehow lost their way severely enough that they were declared acceptable as they were, and the "multi-cloud" label was slapped on to them. Throughout all of this, some themes definitely have emerged that I find that people don't intuitively grasp at first. To wit:

    • It's relatively straightforward to do the basic arithmetic to figure out what a current data center would cost to put into the cloud as is—generally it's a lot! If you do a 1:1 mapping of your existing data center into the cloudy equivalents, it invariably will cost more; that's a given. The real cost savings arise when you start to take advantage of cloud capabilities—your web server farm doesn't need to have 50 instances at all times. If that's your burst load, maybe you can scale that in when traffic is low to five instances or so? Only once you fall into a pattern (and your applications support it!) of paying only for what you need when you need it do the cost savings of cloud become apparent. (A worked example with made-up numbers follows this list.)

    • One of the most misunderstood aspects of Cloud Economics is the proper calculation of Total Cost of Ownership, or TCO. If you want to do a break-even analysis on whether it makes sense to build out a storage system instead of using S3, you've got to include a lot more than just a pile of disks. You've got to factor in disaster recovery equipment and location, software to handle replication of data, staff to run the data center/replace drives, the bandwidth to get to the storage from where it's needed, the capacity planning for future growth—and the opportunity cost of building that out instead of focusing on product features.

    • It's easy to get lost in the byzantine world of cloud billing dimensions and lose sight of the fact that you've got staffing expenses. I've yet to see a company with more than five employees wherein the cloud expense wasn't dwarfed by payroll. Unlike the toy projects some of us do as labors of love, engineering time costs a lot of money. Retraining existing staff to embrace a cloud future takes time, and not everyone takes to this new paradigm quickly.

    • Accounting is going to have to weigh in on this, and if you're not prepared for that conversation, it's likely to be unpleasant. You're going from an old world where you could plan your computing expenses a few years out and be pretty close to accurate. Cloud replaces that with a host of variables to account for, including variable costs depending upon load, amortization of Reserved Instances, provider price cuts and a complete lack of transparency with regard to where the money is actually going (Dev or Prod? Which product? Which team spun that up? An engineer left the company six months ago, but their 500TB of data is still sitting there and so on).
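
To make the first point above concrete with deliberately invented numbers: at a hypothetical $0.10 per instance-hour, 50 instances running around the clock cost about 50 x $0.10 x 730 hours, or roughly $3,650 a month, while an autoscaling group that averages 15 instances over the same month costs roughly 15 x $0.10 x 730, or about $1,095. The savings come from the usage pattern, not from the act of migrating.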

    The worst part is that all of this isn't apparent to newcomers to cloud billing, so when you trip over these edge cases, it's natural to feel as if the problem is somehow your fault. I do this for a living, and I was stymied trying to figure out what data transfer was likely to cost in AWS. I started drawing out how it's billed to customers, and ultimately came up with the "AWS Data Transfer Costs" diagram shown in Figure 1.

    Figure 1. A convoluted mapping of how AWS data transfer is priced out.

    If you can memorize those figures, you're better at this than I am by a landslide! It isn't straightforward, it's not simple, and it's certainly not your fault if you don't somehow intrinsically know these things.

That said, help is at hand. AWS billing is getting much more understandable, with the advent of such things as free Reserved Instance recommendations, the release of the Cost Explorer API and the rise of serverless technologies. For their part, Google's GCP and Microsoft's Azure learned from the early billing stumbles of AWS, and as a result, both have much more understandable cost structures. Additionally, there are a host of cost visibility Platform as a Service offerings out there; they all do more or less the same things as one another, but they're great for ad-hoc queries around your bill. If you'd rather build something you can control yourself, you can shove your billing information from all providers into an SQL database and run something like QuickSight or Tableau on top of it to aid visualization, as many shops do today.
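If you go that route, the plumbing doesn't have to be elaborate. As a minimal sketch (the file name and the "service"/"cost" column names below are invented for illustration; every provider's cost export uses its own schema), you could load a cost report into SQLite and total spend by service before pointing a visualization tool at it:

#!/usr/bin/env python3
# Minimal sketch: load a cost-and-usage CSV export into SQLite and
# total the spend by service. File and column names are placeholders.

import csv
import sqlite3

conn = sqlite3.connect("billing.db")
conn.execute("CREATE TABLE IF NOT EXISTS line_items (service TEXT, cost REAL)")

with open("cost_report.csv") as f:
    rows = [(row["service"], float(row["cost"])) for row in csv.DictReader(f)]

conn.executemany("INSERT INTO line_items VALUES (?, ?)", rows)
conn.commit()

for service, total in conn.execute(
        "SELECT service, SUM(cost) FROM line_items "
        "GROUP BY service ORDER BY SUM(cost) DESC"):
    print("{:<30} ${:,.2f}".format(service, total))

A real pipeline would feed a proper warehouse instead of a local SQLite file, but the shape of the approach is the same.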

    In return for this ridiculous pile of complexity, you get something rather special—the ability to spin up resources on-demand, for as little time as you need them, and pay only for the things that you use. It's incredible as a learning resource alone—imagine how much simpler it would have been in the late 1990s to receive a working Linux VM instead of having to struggle with Slackware's installation for the better part of a week. The cloud takes away, but it also gives.

    Corey Quinn

    Microsoft Announces First Custom Linux Kernel, German Government Chooses Open-Source Nextcloud and More

    3 months ago
    Microsoft News kernel Cloud open source Events gaming

    News briefs for April 17, 2018.

    Microsoft yesterday introduced Azure Sphere, a Linux-based OS and cloud service for securing IoT devices. According to ZDNet, "Microsoft President Brad Smith introduced Azure Sphere saying, 'After 43 years, this is the first day that we are announcing, and will distribute, a custom Linux kernel.'"

The German government's Federal Information Technology Centre (ITZBund) has chosen open-source Nextcloud for its self-hosted cloud solution, iTWire reports. Nextcloud was chosen because it meets ITZBund's strict security requirements and for its scalability "both in terms of large numbers of users and extensibility with additional features".

    European authorities have effectively ended the Whois public database of domain name registration, which ICANN oversees. According to The Register, the service isn't compliant with the GDPR and will be illegal as of May 25th: "ICANN now has a little over a month to come up with a replacement to the decades-old service that covers millions of domain names and lists the personal contact details of domain registrants, including their name, email and telephone number."

A new release of PySolFC, a free and open-source collection of more than 1,000 Solitaire card games and Mahjong games, was announced recently. The new stable release, 2.2.0, is the first since 2009.

    Deadline for proposals to speak at Open Source Summit North America is April 29. OSSN is being held in Vancouver, BC, this year from August 29–31.

    In other event news, Red Hat today announced the keynote speakers and agenda for its largest ever Red Hat Summit being held at the Moscone Center in San Francisco, May 8–10.

    Jill Franklin

    Bassel Khartabil Free Fellowship, GNOME 3.28.1 Release, New Version of Mixxx and More

    3 months ago
    Creative Commons GNOME Subversion multimedia Monitoring

    News briefs for April 16, 2018.

    The Bassel Khartabil Free Fellowship was awarded yesterday to Majd Al-shihabi, a Palestinian-Syrian engineer and urban planning graduate based in Beirut, Lebanon: "The Fellowship will support Majd's efforts in building a unified platform for Syrian and Palestinian oral history archives, as well as the digitizing and release of previously forgotten 1940s era public domain maps of Palestine." The Creative Commons also announced the first three winners of the Bassel Khartabil Memorial Fund: Egypt-based The Mosireen Collective, and Lebanon-based Sharq.org and ASI-REM/ADEF Lebanon. For all the details, see the announcement on the Creative Commons website.

    GNOME 3.28 is ready for prime time after receiving its first point release on Friday, which includes numerous improvements and bug fixes. See the announcement for all the details on version 3.28.1.

    Apache Subversion 1.10 has been released. This version is "a superset of all previous Subversion releases, and is as of the time of its release considered the current "best" release. Any feature or bugfix in 1.0.x through 1.9.x is also in 1.10, but 1.10 contains features and bugfixes not present in any earlier release. The new features will eventually be documented in a 1.10 version of the free Subversion book." New features include improved path-based authorization, new interactive conflict resolver, added support for LZ4 compression and more. See the release notes for more information.

    A new version of Mixxx, the free and open-source DJ software, was released today. Version 2.1 has "new and improved controller mappings, updated Deere and LateNight skins, overhauled effects system, and much more".

Kayenta, a new open-source project from Google and Netflix for automated deployment monitoring, was announced recently. GeekWire reports that the project's goal is "to help other companies that want to modernize their application deployment practices but don't exactly have the same budget and expertise to build their own solution."

    Jill Franklin

    Multiprocessing in Python

    3 months ago
    Multiprocessing in Python Image Reuven M. Lerner Mon, 04/16/2018 - 09:20 HOW-TOs Programming python

    Python's "multiprocessing" module feels like threads, but actually launches processes.

    Many people, when they start to work with Python, are excited to hear that the language supports threading. And, as I've discussed in previous articles, Python does indeed support native-level threads with an easy-to-use and convenient interface.

However, there is a downside to these threads—namely, the global interpreter lock (GIL), which ensures that only one thread executes Python code at a time. Because a thread cedes the GIL whenever it waits for I/O, threads are a poor fit for CPU-bound Python programs, but they work well when you're dealing with I/O.
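To see what the GIL costs CPU-bound code, consider a small and admittedly unscientific experiment (this snippet is mine, not part of the examples that follow): count down from a large number twice, first sequentially and then in two threads. On most machines, the threaded version takes about as long as the sequential one, because only one thread can hold the GIL at a time:

#!/usr/bin/env python3
# Unscientific demo of the GIL's effect on CPU-bound work: two threads
# counting down take roughly as long as doing both counts sequentially.

import threading
import time

def countdown(n):
    while n > 0:
        n -= 1

N = 5_000_000

start = time.time()
countdown(N)
countdown(N)
print("Sequential:  {:.2f} seconds".format(time.time() - start))

start = time.time()
threads = [threading.Thread(target=countdown, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("Two threads: {:.2f} seconds".format(time.time() - start))

Exact timings will vary by machine, but the two numbers are usually in the same ballpark.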

    But even when you're using lots of I/O, you might prefer to take full advantage of a multicore system. And in the world of Python, that means using processes.

    In my article "Launching External Processes in Python", I described how you can launch processes from within a Python program, but those examples all demonstrated that you can launch a program in an external process. Normally, when people talk about processes, they work much like they do with threads, but are even more independent (and with more overhead, as well).

    So, it's something of a dilemma: do you launch easy-to-use threads, even though they don't really run in parallel? Or, do you launch new processes, over which you have little control?

    The answer is somewhere in the middle. The Python standard library comes with "multiprocessing", a module that gives the feeling of working with threads, but that actually works with processes.

    So in this article, I look at the "multiprocessing" library and describe some of the basic things it can do.

    Multiprocessing Basics

    The "multiprocessing" module is designed to look and feel like the "threading" module, and it largely succeeds in doing so. For example, the following is a simple example of a multithreaded program:

#!/usr/bin/env python3

import threading
import time
import random

def hello(n):
    time.sleep(random.randint(1,3))
    print("[{0}] Hello!".format(n))

for i in range(10):
    threading.Thread(target=hello, args=(i,)).start()

print("Done!")

In this example, there is a function (hello) that prints "Hello!" along with whatever argument is passed. The for loop then runs hello ten times, each time in an independent thread.

But wait. Before the function prints its output, it first sleeps for a random number of seconds. When you run this program, the output demonstrates that the threads run in parallel, and not necessarily in the order in which they were invoked:

$ ./thread1.py
Done!
[2] Hello!
[0] Hello!
[3] Hello!
[6] Hello!
[9] Hello!
[1] Hello!
[5] Hello!
[8] Hello!
[4] Hello!
[7] Hello!

    If you want to be sure that "Done!" is printed after all the threads have finished running, you can use join. To do that, you need to grab each instance of threading.Thread, put it in a list, and then invoke join on each thread:

#!/usr/bin/env python3

import threading
import time
import random

def hello(n):
    time.sleep(random.randint(1,3))
    print("[{0}] Hello!".format(n))

threads = [ ]
for i in range(10):
    t = threading.Thread(target=hello, args=(i,))
    threads.append(t)
    t.start()

for one_thread in threads:
    one_thread.join()

print("Done!")

The only difference in this version is that it puts each thread object in a list ("threads") and then iterates over that list, joining the threads one by one.

    But wait a second—I promised that I'd talk about "multiprocessing", not threading. What gives?

    Well, "multiprocessing" was designed to give the feeling of working with threads. This is so true that I basically can do some search-and-replace on the program I just presented:

    • threading → multiprocessing
    • Thread → Process
    • threads → processes
    • thread → process

    The result is as follows:

#!/usr/bin/env python3

import multiprocessing
import time
import random

def hello(n):
    time.sleep(random.randint(1,3))
    print("[{0}] Hello!".format(n))

processes = [ ]
for i in range(10):
    t = multiprocessing.Process(target=hello, args=(i,))
    processes.append(t)
    t.start()

for one_process in processes:
    one_process.join()

print("Done!")

In other words, you can run a function in a new process, with full concurrency and the ability to take advantage of multiple cores, using multiprocessing.Process. It works very much like a thread, including the use of join on the Process objects you create. Each instance of Process represents a process running on the computer, which you can see using ps, and which you can (in theory) stop with kill.
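As a quick illustration (this snippet is mine, not part of the running example), you can inspect a Process's pid attribute, check whether it's still alive and stop it from within Python, much as you would with ps and kill from the shell:

#!/usr/bin/env python3
# Small sketch: start a Process, look at its PID (the same one ps
# would show), then stop it by hand.

import multiprocessing
import time

def worker():
    time.sleep(60)    # stand-in for a long-running task

if __name__ == "__main__":
    p = multiprocessing.Process(target=worker)
    p.start()
    print("Started OS process with PID", p.pid)
    print("Alive?", p.is_alive())

    p.terminate()     # roughly what kill would do from the shell
    p.join()
    print("Alive after terminate?", p.is_alive())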

    What's the Difference?

    What's amazing to me is that the API is almost identical, and yet two very different things are happening behind the scenes. Let me try to make the distinction clearer with another pair of examples.

    Perhaps the biggest difference, at least to anyone programming with threads and processes, is the fact that threads share global variables. By contrast, separate processes are completely separate; one process cannot affect another's variables. (In a future article, I plan to look at how to get around that.)

Here's a simple example of how a function running in a thread can modify a global variable (note that I'm doing this only to prove a point; if you really want to modify global variables from within a thread, you should use a lock):

#!/usr/bin/env python3

import threading
import time
import random

mylist = [ ]

def hello(n):
    time.sleep(random.randint(1,3))
    mylist.append(threading.get_ident())   # bad in real code!
    print("[{0}] Hello!".format(n))

threads = [ ]
for i in range(10):
    t = threading.Thread(target=hello, args=(i,))
    threads.append(t)
    t.start()

for one_thread in threads:
    one_thread.join()

print("Done!")
print(len(mylist))
print(mylist)

The program is basically unchanged, except that it defines a new, empty list (mylist) at the top. The function appends the current thread's ID to that list and then returns.

    Now, the way that I'm doing this isn't so wise, because Python data structures aren't thread-safe, and appending to a list from within multiple threads eventually will catch up with you. But the point here isn't to demonstrate threads, but rather to contrast them with processes.

    When I run the above code, I get:

$ ./th-update-list.py
[0] Hello!
[2] Hello!
[6] Hello!
[3] Hello!
[1] Hello!
[4] Hello!
[5] Hello!
[7] Hello!
[8] Hello!
[9] Hello!
Done!
10
[123145344081920, 123145354592256, 123145375612928,
 ↪123145359847424, 123145349337088, 123145365102592,
 ↪123145370357760, 123145380868096, 123145386123264,
 ↪123145391378432]

    So, you can see that the global variable mylist is shared by the threads, and that when one thread modifies the list, that change is visible to all the other threads.

    But if you change the program to use "multiprocessing", the output looks a bit different:

#!/usr/bin/env python3

import multiprocessing
import time
import random
import os

mylist = [ ]

def hello(n):
    time.sleep(random.randint(1,3))
    mylist.append(os.getpid())
    print("[{0}] Hello!".format(n))

processes = [ ]
for i in range(10):
    t = multiprocessing.Process(target=hello, args=(i,))
    processes.append(t)
    t.start()

for one_process in processes:
    one_process.join()

print("Done!")
print(len(mylist))
print(mylist)

    Aside from the switch to multiprocessing, the biggest change in this version of the program is the use of os.getpid to get the current process ID.

    The output from this program is as follows:

$ ./proc-update-list.py
[0] Hello!
[4] Hello!
[7] Hello!
[8] Hello!
[2] Hello!
[5] Hello!
[6] Hello!
[9] Hello!
[1] Hello!
[3] Hello!
Done!
0
[]

    Everything seems great until the end when it checks the value of mylist. What happened to it? Didn't the program append to it?

Sort of. The thing is, there is no single "it" in this program. Each new process created with "multiprocessing" gets its own copy of the global mylist list. Each process thus appends to its own copy, which disappears when that process exits.

    This means the call to mylist.append succeeds, but it succeeds in ten different processes. When the function returns from executing in its own process, there is no trace left of the list from that process. The only mylist variable in the main process remains empty, because no one ever appended to it.

    Queues to the Rescue

In the world of threaded programs, even when you're able to append to the global mylist variable, you shouldn't do it. That's because Python's data structures aren't thread-safe. Indeed, only one data structure is guaranteed to be thread-safe—the Queue class in the multiprocessing module.

Queues are FIFOs (that is, "first in, first out"). Whoever wants to add data to a queue invokes the put method on it, and whoever wants to retrieve data from a queue uses the get method.

    Now, queues in the world of multithreaded programs prevent issues having to do with thread safety. But in the world of multiprocessing, queues allow you to bridge the gap among your processes, sending data back to the main process. For example:

#!/usr/bin/env python3

import multiprocessing
import time
import random
import os
from multiprocessing import Queue

q = Queue()

def hello(n):
    time.sleep(random.randint(1,3))
    q.put(os.getpid())
    print("[{0}] Hello!".format(n))

processes = [ ]
for i in range(10):
    t = multiprocessing.Process(target=hello, args=(i,))
    processes.append(t)
    t.start()

for one_process in processes:
    one_process.join()

mylist = [ ]
while not q.empty():
    mylist.append(q.get())

print("Done!")
print(len(mylist))
print(mylist)

    In this version of the program, I don't create mylist until late in the game. However, I create an instance of multiprocessing.Queue very early on. That Queue instance is designed to be shared across the different processes. Moreover, it can handle any type of Python data that can be stored using "pickle", which basically means any data structure.

In the hello function, the call to mylist.append is replaced with one to q.put, placing the current process' ID number on the queue. Each of the ten processes the program creates adds its own PID to the queue.

Note that this program takes place in stages. First it launches ten processes, then they all do their work in parallel, and then it waits for them to complete (with join) so that it can process the results. It pulls the data off the queue, puts it onto mylist and then prints what it has retrieved.
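Because anything picklable can travel through the queue, you aren't limited to passing bare integers. The following small variation (my own sketch, not part of the article's running example) has each worker put a dict containing its PID, its input and a computed result onto the queue:

#!/usr/bin/env python3
# Variation on the program above: each worker sends a dict through the
# queue, illustrating that any picklable value can cross process
# boundaries this way.

import multiprocessing
import os
from multiprocessing import Queue

q = Queue()

def work(n):
    q.put({"pid": os.getpid(), "input": n, "result": n * n})

processes = [ ]
for i in range(5):
    p = multiprocessing.Process(target=work, args=(i,))
    processes.append(p)
    p.start()

for one_process in processes:
    one_process.join()

while not q.empty():
    print(q.get())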

The implementation of queues is so smooth and easy to work with, it's easy to forget that they rely on some serious behind-the-scenes operating-system machinery to keep things coordinated. It's easy to think that you're working with threads, but that's exactly the point of multiprocessing: it might feel like threads, but each process runs separately. This gives you true parallelism within your program, something the GIL prevents threads from delivering for CPU-bound work.

    Conclusion

    Threading is easy to work with, but threads don't truly execute in parallel. Multiprocessing is a module that provides an API that's almost identical to that of threads. This doesn't paper over all of the differences, but it goes a long way toward making sure things aren't out of control.

    Reuven M. Lerner

    FOSS Project Spotlight: Ravada

    3 months 1 week ago
    FOSS Project Spotlight: Ravada Image Francesc Guasch Fri, 04/13/2018 - 13:48 Desktop FOSS KVM Virtualization

    Ravada is an open-source project that allows users to connect to a virtual desktop.

Currently, it supports KVM, but its back end has been designed and implemented to allow other hypervisors to be added to the framework in the future. The only client-side requirements are a web browser and a remote viewer that supports the SPICE protocol.

    Ravada's main features include:

    • KVM back end.
    • LDAP and SQL authentication.
    • Kiosk mode.
    • Remote access for Windows and Linux.
    • Light and fast virtual machine clones for each user.
    • Instant clone creation.
    • USB redirection.
    • Easy and customizable end-user interface (i18n, l10n).
    • Administration from a web browser.

    It's very easy to install and use. Following the documentation, virtual machines can be deployed in minutes. It's an early release, but it's already used in production. The project is open source, and you can download the code from GitHub. Contributions welcome!

    Francesc Guasch

    Elisa Music Player Debuts, Zenroom Crypto-Language VM Reaches Version 0.5.0 and More

    3 months 1 week ago
    KDE multimedia Security ZFS System76 GNOME News

    News briefs for April 13, 2018.

The Elisa music player, developed by the KDE community, debuted yesterday, with version 0.1. Elisa has good integration with the Plasma desktop and also supports other Linux desktop environments, as well as Windows and Android. In addition, the Elisa release announcement notes, "We are creating a reliable product that is a joy to use and respects our users' privacy. As such, we will prefer to support online services where users are in control of their data."

    Mozilla released Firefox 11.0 for iOS yesterday, and this new version turns on tracking protection by default. The feature uses a list provided by Disconnect to identify trackers, and it also provides options for turning it on or off overall or for specific websites.

The Zenroom project, a brand-new crypto-language virtual machine, has reached version 0.5.0. Zenroom's goal is "improving people's awareness of how their data is processed by algorithms, as well as facilitate the work of developers to create and publish algorithms that can be used both client and server side." In addition, it "has no external dependencies, is smaller than 1MB, runs in less than 64KiB memory and is ready for experimental use on many target platforms: desktop, embedded, mobile, cloud and browsers." The program is free software and is licensed under the GNU LGPL v3. Its main use case is "distributed computing of untrusted code where advanced cryptographic functions are required".

    ZFS On Linux, recently in the news for data-loss issues, may finally be getting SSD TRIM support, which has been in the works for years, according to Phoronix.

System76 recently became a GNOME Foundation Advisory Board member. Neil McGovern, Executive Director of the GNOME Foundation, commented, "System76's long-term ambition to see free software grow is highly commendable, and we're extremely pleased that they're coming on board to help support the Foundation and the community." See the betanews article for more details.

    Jill Franklin