ETL at a glance

Business Intelligence, at its core, is about taking data from different sources, compiling the data into a format that can be used, and running reports on that data. The process of compiling the data is called extract, transform, load, or ETL for short. If you are going to work with Business Intelligence, you need at least a rudimentary understanding of ETL. Here’s a quick overview:

Extract means connecting to all the data sources you have, such as payroll, accounting, inventory and sales (basically any system or data source that can provide you with base data for statistical analysis), and extracting the raw data.

Transform means converting the data from their disparate formats into one consistent format shared across datasets, shaped to fit the needs of the business.

Load means loading the transformed data into a system designed to run reports on the data, such as a data mart or a data warehouse.

Once the data has gone through ETL, you can either run analysis to get reports from the data, or run it through ETL again to load it into a different data store.
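To make the three steps a bit more concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the two sample sources, the table name `sales`, and the use of an in-memory SQLite database as a stand-in for a data mart. A real ETL tool does far more, but the shape is the same.

```python
import csv
import io
import sqlite3

# Hypothetical raw data: two sources describing sales in different formats.
SALES_CSV = "date,amount\n2014-06-01,100.50\n2014-06-02,75.00\n"
WEBSHOP_ROWS = [{"day": "01/06/2014", "total_cents": 2500}]


def extract():
    """Extract: read the raw rows from each source as they are."""
    csv_rows = list(csv.DictReader(io.StringIO(SALES_CSV)))
    return csv_rows, WEBSHOP_ROWS


def transform(csv_rows, webshop_rows):
    """Transform: bring both sources to one shape, (ISO date, amount)."""
    unified = [(r["date"], float(r["amount"])) for r in csv_rows]
    for r in webshop_rows:
        day, month, year = r["day"].split("/")
        unified.append((f"{year}-{month}-{day}", r["total_cents"] / 100))
    return unified


def load(rows, conn):
    """Load: write the unified rows into the reporting store."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (sale_date TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for a data mart
load(transform(*extract()), conn)

# Once loaded, a report is just a query against the store.
report = conn.execute(
    "SELECT sale_date, SUM(amount) FROM sales GROUP BY sale_date ORDER BY sale_date"
).fetchall()
print(report)
```

The point of the split is that the report query never has to know that one source used day/month/year dates and cents; that mess is dealt with once, in the transform step.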

Starting out with OBIEE

Here it is. The reason I chose Oracle Linux over all the other distros out there. Over Debian-based Ubuntu, with which I am somewhat familiar, and over Fedora-based Red Hat, which is commonly used in enterprise environments. The reason I chose Oracle Linux is that I want to learn more about Oracle Business Intelligence Enterprise Edition (OBIEE).

The learning curve with configuring OBIEE is significant, to say the least, which is why I recommend running it under Oracle Linux on a Virtual Machine, snapshotted at regular intervals. That way, when you run into a roadblock, you can make a note of what went wrong, revert to the last snapshot and avoid that particular roadblock the next time around.

Over the coming weeks and months, I will be chronicling my journey through getting OBIEE set up. Don’t worry; there’ll be other stuff in there as well. For now, here are a few links that I have found useful:

One important difference between Debian and Fedora

Before starting out with Oracle Linux this summer, my experience with Linux had been more or less confined to Ubuntu, which stems from Debian. Oracle Linux, on the other hand, stems from the Fedora project and Red Hat. Though they both build on the same kernel, they diverge from each other in a few important aspects. Software available in the repositories for Debian is divided into main (free), contrib and non-free. All software available in Fedora’s repositories is free.

To me, the single most significant difference between Debian and Fedora (and by extension between Ubuntu and Oracle Linux) is how software is distributed. Anyone who has used Ubuntu knows the apt-get command. Try that in Oracle Linux, and you get an error. Apt-get simply doesn’t exist. The reason is that Oracle Linux doesn’t use the apt-get dependency resolver, but rather one called yum. Add to that different package formats (deb and rpm) and package managers (dpkg and rpm), and you start to see the level of difference. For a list of yum commands, have a look here.

There are other differences, too, I’m sure, but none are as important for basic use and understanding of operation of the OS as the ability to install software.
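For anyone coming from Ubuntu, a small cheat sheet helps. The sketch below is my own illustration, not an official mapping: it only covers three subcommands I am sure about, and the helper name `to_yum` is made up.

```python
# apt-get subcommand -> closest yum subcommand (deliberately incomplete)
APT_TO_YUM = {
    "install": "install",
    "remove": "remove",
    "upgrade": "update",  # both upgrade the packages already installed
}


def to_yum(apt_command):
    """Translate a simple 'apt-get <subcommand> [packages]' line to yum."""
    parts = apt_command.split()
    if len(parts) < 2 or parts[0] != "apt-get" or parts[1] not in APT_TO_YUM:
        raise ValueError(f"no known yum equivalent for: {apt_command!r}")
    return " ".join(["yum", APT_TO_YUM[parts[1]]] + parts[2:])


print(to_yum("apt-get install ftp"))  # prints: yum install ftp
```

Anything not in the table raises an error rather than guessing, which seems the safer choice when the result is going to be run as root.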

Oracle Linux: Unable to connect to FTP server in terminal

In order to keep the virtual hard drive on my VM as small as possible, I prefer keeping installers on a network share that I can connect to through FTP. The terminal command for connecting to FTP servers is, handily enough, ftp. When I ran that command, the terminal returned “Command not found”. It turns out that the ftp client is not installed on Oracle Linux by default.

Installing it is easy enough, though. In a terminal, as root, run the command yum install ftp. Answer yes to all questions and hey presto, FTP is installed. The lesson here: never assume a resource is installed. I’m sure I’ll run into more of these down the road, and it’s a good lesson to have learned, though I suspect it is one that I will be reminded of time and again.
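One way to act on that lesson is to check for the commands you rely on before you need them. Here is a minimal sketch using Python’s standard `shutil.which`; the helper name `missing_commands` is made up, and the install hint assumes the package is named after the command, which holds for ftp but not for everything.

```python
import shutil


def missing_commands(commands, which=shutil.which):
    """Return the commands from the list that cannot be found on PATH."""
    return [name for name in commands if which(name) is None]


missing = missing_commands(["ftp"])
if missing:
    # Assumption: package name equals command name (true for ftp).
    print("Not installed:", ", ".join(missing))
    print("Try, as root: yum install " + " ".join(missing))
else:
    print("All required commands are available")
```

The `which` parameter only exists so the lookup can be swapped out when testing; in normal use you just pass the list of commands.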

Oracle Linux: Unable to connect to the internet

When first installing Oracle Linux, you may find yourself unable to connect to the internet. You may make the mistake of thinking that the problem is with the network settings on the host side, and try to futz about changing what type of network connection the VM connects to. Don’t bother. It isn’t going to do any good, and you will only get annoyed.

The problem is on the guest side, inside the VM itself, and is simple enough to solve; here’s how:

  1. On the VM, go to System > Preferences > Network Connections
  2. Highlight System eth0, and click Edit
  3. Check the check box marked “Connect Automatically”
  4. Click “Apply”

You’ll get prompted for a root password, then you’re done, and you should now be able to connect to the internet.
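If you would rather verify the setting than trust the check box, here is a small sketch. It rests on an assumption worth stating loudly: on Red Hat-style systems of this generation, “Connect Automatically” for System eth0 corresponds to ONBOOT=yes in /etc/sysconfig/network-scripts/ifcfg-eth0. The sample file contents below are hypothetical.

```python
def parse_ifcfg(text):
    """Parse KEY=value lines from an ifcfg-style file into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def connects_automatically(text):
    """True only if ONBOOT is explicitly set to yes (a simplification)."""
    return parse_ifcfg(text).get("ONBOOT", "").lower() == "yes"


# Hypothetical contents of ifcfg-eth0 before and after ticking the box.
BEFORE = 'DEVICE="eth0"\nBOOTPROTO="dhcp"\nONBOOT="no"\n'
AFTER = 'DEVICE="eth0"\nBOOTPROTO="dhcp"\nONBOOT="yes"\n'
print(connects_automatically(BEFORE), connects_automatically(AFTER))
```

To check the real thing, read the file and pass its contents in: `connects_automatically(open("/etc/sysconfig/network-scripts/ifcfg-eth0").read())`.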

The Cloud Storage Price War is upon us

Back in March, Google announced that they were massively slashing storage prices for their customers. Previously $4.99/month for 100 GB and $49.99/month for 1 TB, the prices were cut to $1.99 and $9.99, respectively. Beyond that, they are charging $99.99 per month for every 10 TB, meaning that 40 TB will set you back $399.96 per month. Now, in and of itself, that is interesting, but when compared to the pricing of competitors, it gets downright impressive.

For example, Dropbox charges $9.99 per month for 100 GB while Apple’s iCloud service runs you $20 per year (or just about $1.67 per month, billed annually) for 10 GB.
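To compare plans of different sizes, it helps to boil them down to price per GB per month. The quick calculation below uses only the prices quoted above; the one assumption of mine is counting 1 TB as 1000 GB.

```python
PLANS = [
    # (provider, plan size in GB, price in USD per month)
    ("Google Drive", 100, 1.99),
    ("Google Drive", 1000, 9.99),  # 1 TB, assumed to be 1000 GB
    ("Dropbox", 100, 9.99),
    ("iCloud", 10, 20 / 12),       # $20 per year, spread over 12 months
]


def price_per_gb(size_gb, monthly_usd):
    """Monthly price in USD for a single GB of the plan."""
    return monthly_usd / size_gb


for provider, size_gb, monthly in PLANS:
    rate = price_per_gb(size_gb, monthly)
    print(f"{provider:12} {size_gb:>5} GB  ${rate:.4f} per GB per month")
```

By that measure, Dropbox’s 100 GB plan costs roughly five times as much per GB as Google’s, and iCloud’s 10 GB plan roughly eight times as much.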

Apple and Google have their market dominance funnelling users to them: Apple because iOS users have the option of using iCloud to back up their iOS devices, and Google because Gmail users already use Google Drive storage for their email. Dropbox has no such funnelling of users and income. Hence, their lack of movement on the pricing front seems ill-advised to me.

Dropbox are banking on their existing user base staying faithful, while attempting to innovate on whatever front they can. The only place where they have an advantage, and only over one of their competitors, is the iOS app’s automatic photo upload feature, which I would expect to see implemented by Google shortly.

Unless Dropbox find a way to remain relevant, I don’t think they will last long. Time will tell.

Run Control Panel applets with elevated permissions

Sometimes, you want to launch a Control Panel applet with elevated permissions. Normally, you would right-click the program you want, and select to run as an administrator. However, the control panel applets don’t give you that option, and so we need to go deeper.

As it turns out, the Control Panel applets are all located in C:\Windows\System32, and are denoted by the .cpl file type. Simply right-click the one you want, and off you go. Alternatively, you can open an elevated command prompt and run the command control applet.cpl, where applet.cpl is the name of the applet you want to run. What are those names, you ask? Here you go:

   Control panel tool             Command
   Add/Remove Programs            control appwiz.cpl
   Date/Time Properties           control timedate.cpl
   Display Properties             control desk.cpl
   Fonts Folder                   control fonts
   Internet Properties            control inetcpl.cpl
   Keyboard Properties            control main.cpl keyboard
   Mouse Properties               control main.cpl
   Multimedia Properties          control mmsys.cpl
   Network Properties             control netcpl.cpl
   Password Properties            control password.cpl
   Printers Folder                control printers
   Regional Settings              control intl.cpl
   Sound Properties               control mmsys.cpl sounds
   System Properties              control sysdm.cpl

Note that this list is in no way exhaustive, but it should do for most applications.
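If you find yourself doing this often, the commands in the table are easy to script. The sketch below is just that, a sketch: the helper names `applet_command` and `launch_applet` are my own, and the applet only runs elevated if the Python process itself was started from an elevated prompt.

```python
import subprocess
import sys


def applet_command(applet, *args):
    """Build the argument list for launching a Control Panel applet."""
    return ["control", applet, *args]


def launch_applet(applet, *args):
    """Launch the applet; it inherits the privileges of this process."""
    if sys.platform != "win32":
        raise OSError("Control Panel applets only exist on Windows")
    return subprocess.Popen(applet_command(applet, *args))


# Show what would be run for the Keyboard Properties entry in the table.
print(applet_command("main.cpl", "keyboard"))
```

On Windows, calling `launch_applet("sysdm.cpl")` from an elevated Python session opens System Properties with the same elevation.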

Force uninstall when installer is unavailable

Imagine the scene: you are having a problem with a program, and the manufacturer tells you that the solution is a complete uninstall followed by a reinstall. You go to uninstall, and Windows tells you that it can’t find the installer. Looking around, neither can you. So, now you’re up a certain waterway without a certain rowing implement, aren’t you? Not necessarily.

Luckily, Microsoft has created a tool which automatically finds the registry keys in question, and lets you remove them. The tool is called FixIt, and is intuitive almost to a fault. Keep in mind that the tool does not support the runas command, and must be run by a user that has local administrative privileges.

Tesla: All our patents are belong to you

In a move that is at once impressive and baffling to many commentators, Tesla Motors CEO Elon Musk recently announced that they are applying the open source philosophy to their portfolio of patents. While a great publicity stunt, there seems to be more to it than that. Musk said:

At Tesla, however, we felt compelled to create patents out of concern that the big car companies would copy our technology and then use their massive manufacturing, sales and marketing power to overwhelm Tesla. We couldn’t have been more wrong. The unfortunate reality is the opposite: electric car programs (or programs for any vehicle that doesn’t burn hydrocarbons) at the major manufacturers are small to non-existent, constituting an average of far less than 1% of their total vehicle sales.

At best, the large automakers are producing electric cars with limited range in limited volume. Some produce no zero emission cars at all.

By open sourcing their patents, they do run the risk of losing market share. That is, I think, a minor concern for the company, if it is one at all. I think one of two things will happen. Either no one will use the patents, at which point we are where we were: status quo ante. Or the patents are used, and Tesla’s burden in building an ecosystem around their technology is shared by other manufacturers. Either way, by letting anyone use the patents, more smart people can do more smart things with them, improving the electric car market for everyone.

Oracle Linux: Insufficient memory to auto-enable kdump

Remember how I said “you should be all set”, last week? Turns out, I was only partially right. After creating a local user account, Linux also configures kdump, the kernel crash dumping mechanism. When attempting to do so, it returned this error message:

Insufficient memory to auto-enable kdump. Use system-config-kdump to configure manually

As problems go, this one is fairly minor. Kdump does not need to be enabled or configured for everything else to work. I made a note of the issue, clicked right through the error, and logged in. That said, I really don’t like leaving stuff like that without looking into resolving it. I immediately grokked that system-config-kdump refers to a shell command, opened a terminal, and entered the command. It opened this window:

[Screenshot: KDump config window]

I proceeded to click “Enable”, and then “Apply”. The system asked for a reboot, and then prompted for the root password, three times. After that, it returned the following error:

Starting kdump:[FAILED]

On a hunch, I increased the kdump memory to 160 MB, then tried to apply once more. After being prompted for the root password thrice, the settings were saved. Peculiarly, kdump memory was set back to its original 128 MB. Following a reboot, I checked, and kdump was up and running, with kdump memory still set to 128 MB.
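A way to double-check what the kernel actually booted with, rather than what the config window shows, is to look for the crashkernel= parameter on the kernel command line, which is how memory for kdump gets reserved. The sketch below parses such a line; the sample command line is hypothetical, and the helper name `crashkernel_setting` is my own.

```python
import re


def crashkernel_setting(cmdline):
    """Return the value of crashkernel= from a kernel command line, or None."""
    match = re.search(r"(?:^|\s)crashkernel=(\S+)", cmdline)
    return match.group(1) if match else None


# Hypothetical command line; on a live system, read /proc/cmdline instead.
SAMPLE = "ro root=/dev/mapper/vg-root rhgb quiet crashkernel=128M"
print(crashkernel_setting(SAMPLE))  # prints: 128M
```

On the VM itself, `crashkernel_setting(open("/proc/cmdline").read())` shows the value in effect since the last boot; a result of None would mean no memory was reserved for kdump at all.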