TRUNCATE a Table with Corrupted Blocks

The customer reported that a block corruption error is logged in one of his databases. This happened for a table owned by the application we maintain. The error was:

ORA-01578: ORACLE data block corrupted (file # 8, block # 52195)
ORA-01110: data file 8: ...
ORA-26040: Data block was loaded using the NOLOGGING option

First we thought that the error is received on the production, but, later on, they have provided more details of the problem and, apparently, this error was occurring on a duplicate database they are creating every day as part of their DR strategy.

The Root Cause (aka RC)

Well, not very difficult to grasp the big picture. The scenario we had in mind was:

  1. the corrupted block was belonging to a table which, indeed, was being loaded using, yes, SqlLoader with the DIRECT option enabled.
  2. every night the production database was backed up.
  3. after the backup, the SqlLoader job took place
  4. then, the night backup and the archives generated afterwards were used to create a duplicate database in DR.
  5. in DR, BANG!!! When the NOLOGGING table is queried Oracle complains with an ORA-01578 error

Validate It

Let’s prepare the playground. First of all, we create a dummy user and its corresponding tablespace:

SQL> create tablespace guineea_pig datafile size 50M;

Tablespace created.

SQL> grant create session, create table to guineea_pig identified by ***;

Grant succeeded.

SQL> alter user guineea_pig default tablespace guineea_pig quota unlimited on guineea_pig;

User altered.

Then, we create two tables: T1 and T2. We’ll use them later to do a NOLOGGING operation as a substitute for the SqlLoader.

SQL> connect guineea_pig/xxx
Connected.
SQL> create table t1 as select * from all_objects;

Table created.

SQL> create table t2 nologging as select * from t1 where 1=2;

Table created.

Before anything else, let’s make a backup of the database (step 1 from the scenario):

RMAN> backup database include current controlfile plus archivelog;

Starting backup at 2015-02-23 17:24:41
current log archived
using channel ORA_DISK_1
channel ORA_DISK_1: starting archived log backup set

...

Finished backup at 2015-02-23 17:26:21

The above is the backup we’ll use to create the duplicated database. Now, let’s perform a NOLOGGING operation (corresponds to the SqlLoader step):

SQL> insert /*+ APPEND */ into t2 select * from t1;

66167 rows created.

SQL> commit;

Commit complete.

Ok, let’s backup just the archives now:

RMAN> backup archivelog all not backed up 1 times;

Starting backup at 2015-02-23 17:38:48
current log archived
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=44 device type=DISK
skipping archived logs of thread 1 from sequence 592 to 594; already backed up
channel ORA_DISK_1: starting archived log backup set
channel ORA_DISK_1: specifying archived log(s) in backup set
input archived log thread=1 sequence=595 RECID=4 STAMP=872444243
input archived log thread=1 sequence=596 RECID=5 STAMP=872444328
channel ORA_DISK_1: starting piece 1 at 2015-02-23 17:38:48
channel ORA_DISK_1: finished piece 1 at 2015-02-23 17:38:49
piece handle=/u01/app/oracle/fast_recovery_area/VENA/backupset/2015_02_23/o1_mf_annnn_TAG20150223T173848_bgpld8s2_.bkp tag=TAG20150223T173848 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 2015-02-23 17:38:49

Now, ready for creating the duplicate from the backups. In our tests, VENA is the production database and VENADUP is the duplicate. The general flow for a duplicate (pretty boring for a DBA who did this thousands of times before):

  1. create the parameter file:
a. create a dummy /tmp/venadup.ini file:

    *.audit_file_dest='/u01/app/oracle/admin/venadup/adump'
    *.audit_trail='db'
    *.compatible='11.2.0.4.0'
    *.db_block_size=8192
    *.db_create_file_dest='/u01/app/oracle/oradata'
    *.db_domain=''
    *.db_name='venadup'
    *.db_uniqu_name='venadup'
    *.db_recovery_file_dest='/u01/app/oracle/fast_recovery_area'
    *.db_recovery_file_dest_size=53687091200
    *.diagnostic_dest='/u01/app/oracle'
    *.dispatchers='(PROTOCOL=TCP) (SERVICE=venadupXDB)'
    *.memory_target=1660944384
    *.open_cursors=800
    *.processes=150
    *.remote_login_passwordfile='EXCLUSIVE'
    *.undo_tablespace='UNDOTBS1'

b. create the SPFILE:

    [oracle@venus ~]$ ORACLE_SID=venadup sqlplus / as sysdba

    SQL*Plus: Release 11.2.0.4.0 Production on Mon Feb 23 17:56:12 2015

    Copyright (c) 1982, 2013, Oracle.  All rights reserved.

    Connected to an idle instance.

    SQL> startup pfile='/tmp/venadup.ini' nomount;
    ORACLE instance started.

    Total System Global Area 1653518336 bytes
    Fixed Size                  2253784 bytes
    Variable Size            1006636072 bytes
    Database Buffers          637534208 bytes
    Redo Buffers                7094272 bytes
    SQL> create spfile from pfile='/tmp/venadup.ini';

    File created.

c. restart the instance with the new SPFILE:

    [oracle@venus ~]$ ORACLE_SID=venadup sqlplus / as sysdba

    SQL> shutdown immediate
    ORA-01507: database not mounted


    ORACLE instance shut down.
    SQL> startup nomount
    ORACLE instance started.
  1. create the password file for the duplicate database:

    [oracle@venus ~]$ cd $ORACLE_HOME/dbs [oracle@venus dbs]$ orapwd file=orapwvenadup password=**

  2. prepare the listener entry:

    SID_LIST_LISTENER = (SID_LIST = (SID_DESC = (GLOBAL_DBNAME = venadup) (ORACLE_HOME = /u01/app/oracle/product/11.2.0/4e_db) (SID_NAME = venadup) )

  3. reload the listener: lsnrctl reload

  4. prepare the TNS entries:

    VENA = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = venus.localnet)(PORT = 1521)) ) (CONNECT_DATA = (SERVICE_NAME = VENA) ) )

    VENADUP = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = venus.localnet)(PORT = 1521)) ) (CONNECT_DATA = (SERVICE_NAME = VENADUP) ) )

  5. start the duplicate process:

    [oracle@venus dbs]$ rman target=sys/**@vena auxiliary=sys/**@venadup

    Recovery Manager: Release 11.2.0.4.0 - Production on Mon Feb 23 18:31:16 2015

    Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.

    connected to target database: VENA (DBID=2950581980) connected to auxiliary database: VENADUP (not mounted)

    RMAN> duplicate target database to venadup;

Now, go and query the T2 table:

[oracle@venus dbs]$ sqlplus guineea_pig/****@venadup

SQL> select count(*) from t2;
select count(*) from t2
                    *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 7, block # 1155)
ORA-01110: data file 7:
'/u01/app/oracle/oradata/VENADUP/datafile/o1_mf_guineea__bgpop6gz_.dbf'
ORA-26040: Data block was loaded using the NOLOGGING option

Good, the initial hypothesis is confirmed.

Solutions?

Well, obvious, the recommended way. Take a backup (full or incremental) of those data files affected by the direct load operations and then do the duplicate. Or, use the FORCE LOGGING on the database level, but this will affect the performance of the load operation.

Back to the Question

It turned out that the table in question was configured with NOLOGGING option because it’s just a stage table. So, after the records are processed, they are not needed anymore in this table. That’s why the title of this post: would be possible to TRUNCATE the stage table on production after the processing is finished so that, on the duplicate, querying this table to not pose any problems as it is supposed to be empty? I forgot to mention, the biggest problem for the customer was the flooding of the alert.log with that corruption error, triggered by the job which gathers statistics automatically.

Actually we want to test this:

  1. we have the night backup
  2. then we have some NOLOGGING operations which doesn’t generate redolog
  3. then a TRUNCATE TABLE is executed by the application at the end of the processing it’s doing. This statement generates redolog for the command itself.
  4. later on, the duplicate takes place. It restores the night backup and applies the redolog generated after the backup was taken. The tricky thing is that, on the duplicate database, the table in question becomes corrupted because of the lack of the necessary redolog (those DIRECT LOAD operations), but a bit further in the redolog stream which is being crunched, the TRUNCATE TABLE command is applied on the corrupted table. Will it work or will it fail miserably?

Let’s see. We assume that we have the night backup already taken and T2 table was affected by a direct load operation.

[oracle@venus ~]$ sqlplus guineea_pig/*****@vena

SQL> select count(*) from t2;

  COUNT(*)
----------
    66167

SQL> truncate table t2;

Table truncated.

[oracle@venus ~]$ rman target /

RMAN> backup archivelog all not backed up 1 times;

Now, duplicate the database and query the T2 table:

[oracle@venus ~]$ rman target=sys/****@vena auxiliary=sys/****@venadup

Recovery Manager: Release 11.2.0.4.0 - Production on Mon Feb 23 18:55:43 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: VENA (DBID=2950581980)
connected to auxiliary database: VENADUP (not mounted)

RMAN> duplicate target database to venadup;

[oracle@venus ~]$ sqlplus guineea_pig/*****@venadup

SQL> select count(*) from t2;

  COUNT(*)
----------
        0

So, the answer is YES, a truncate will work on a table with corrupted blocks because of a direct load operations.

Actually, it’s working the other way around as well, on the duplicated database:

[oracle@venus ~]$ sqlplus guineea_pig/***@venadup

SQL*Plus: Release 11.2.0.4.0 Production on Mon Feb 23 19:51:32 2015

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning option

SQL> select count(*) from t2;
select count(*) from t2
                    *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 7, block # 1155)
ORA-01110: data file 7:
'/u01/app/oracle/oradata/VENADUP/datafile/o1_mf_guineea__bgpt3506_.dbf'
ORA-26040: Data block was loaded using the NOLOGGING option


SQL> truncate table t2;

Table truncated.

SQL> select count(*) from t2;

  COUNT(*)
----------
        0

So, in order to fix the problem, the DBA can safely truncate this table in the duplicate database, as soon as the duplicate process is finished. Again, this is just because we’re talking about a staging table. Yet, a word of caution needs to be raised: if the duplicate happens while the staging table is being processed then the duplicate database might be inconsistent as far as the application logic is concerned.

Cloning an Oracle Home

Cloning an Oracle home is required from time to time. The big advantage of using this approach is that it preserve the patches you might have been installed in that home. Of course, cloning is valid only for the same platform. Don’t expect to take a Windows Oracle home and clone it on a Linux box.

First step: backup the current Oracle home

It doesn’t matter what tool you use, but it’s important to pay attention to the files ownership. That tool needs to preserve that information. So, on Linux, we can use this:

cd $ORACLE_HOME
tar cf - * | gzip > /tmp/oracle_home.tar.gz

In $ORACLE_HOME is the home you want to clone, the source.

Second step: restore on the target

The target can be on the same host or on a different machine:

mkdir <target_new_oracle_home>
cd <target_new_oracle_home>
gunzip < /tmp/oracle_home.tar.gz | tar xf -

Third step: finalize the cloning process

Now, we need to reconfigure and attach the new home in the Oracle inventory. So, you need to do this:

[oracle@mercur 2e_db_cloned]$ oui/bin/runInstaller \
  -clone \
  -silent ORACLE_HOME=/u01/app/oracle/product/12.1.0/2e_db_cloned \
  ORACLE_HOME_NAME="Oracle_12c_cloned" \
  ORACLE_BASE=/u01/app/oracle \
  OSDBA_GROUP=dba \
  OSPER_GROUP=dba

Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 4061 MB    Passed
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2015-01-26_06-19-33PM. Please wait ...
Oracle Universal Installer, Version 12.1.0.2.0 Production
Copyright (C) 1999, 2014, Oracle. All rights reserved.

... output truncated ...

Linking in progress (Monday, January 26, 2015 6:19:47 PM EET)
.                                                                81% Done.
Link successful

Setup in progress (Monday, January 26, 2015 6:20:21 PM EET)
..........                                                      100% Done.
Setup successful

Saving inventory (Monday, January 26, 2015 6:20:22 PM EET)
Saving inventory complete
Configuration complete

End of install phases.(Monday, January 26, 2015 6:20:44 PM EET)
WARNING:
The following configuration scripts need to be executed as the "root" user.
/u01/app/oracle/product/12.1.0/2e_db_cloned/root.sh
To execute the configuration scripts:
    1. Open a terminal window
    2. Log in as "root"
    3. Run the scripts

The cloning of Oracle_12c_cloned was successful.
Please check '/u01/app/oraInventory/logs/cloneActions2015-01-26_06-19-33PM.log' for more details.

Just run the root scripts and you’re good to go.

Response Files

One of the frequent tasks a DBA must carry out is to install Oracle software on new hosts, along with, presumably, new Oracle databases. Usually, it’s just a matter of launching the Oracle installer or “dbca” utility and then following the wizard. Not a big deal! However, this approach has two inconveniences:

  • you need a X server in order to use the GUI
  • the lack of standardization: it’s hard to remember all the options, locations and so on, not to mention it’s error prone
  • if many installations are to be done, it’s not funny to click-click through all those wizard pages

Oracle addresses these kind of problems by means of:

  • silent installations
  • response files

Response Files Location

As an example, we’re going to install a new Oracle 12c using the silent/response files facility. We assume that the server is already prepared for such an installation, so all prerequisites have been already taken care of.

The first step is to install the Oracle software. After we downloaded the required archive and unzip it, on Linux, we end up with something like this:

[oracle@mercur 12102_linux64]$ find . -mindepth 2 -maxdepth 2 -printf '%M %u %p\n'
drwxrwx--- root ./database/rpm
drwxrwx--- root ./database/stage
-rwxrwx--- root ./database/runInstaller
drwxrwx--- root ./database/sshsetup
-rwxrwx--- root ./database/readme.html
-rwxrwx--- root ./database/welcome.html
drwxrwx--- root ./database/install
drwxrwx--- root ./database/response

The template response files are in the “database/response” folder:

[oracle@mercur 12102_linux64]$ ls -al database/response/
total 120
drwxrwx--- 1 root vboxsf  4096 Jul  7  2014 .
drwxrwx--- 1 root vboxsf  4096 Jan 19 12:02 ..
-rwxrwx--- 1 root vboxsf 74822 Apr  4  2014 dbca.rsp
-rwxrwx--- 1 root vboxsf 25036 Jul  7  2014 db_install.rsp
-rwxrwx--- 1 root vboxsf  6038 Jan 24  2014 netca.rsp

So, there’s a template for the software installation, one for the network setup and one for the database installation using “dbca”.

Create and Use the Response File

If a RSP file is already provided, the simplest method is to make a copy of it and edit it so that to be customized to the DBA/organization standard.

The other option is to record a RSP file using the installer if, of course, such an option exists for that installer. For example, up to 11.2, the Oracle Universal Installer used to have the “-record” parameter which might have be used in order to generate a response file. Interestingly enough, this option is not a valid one anymore in 12c.

As soon as you have the response file, you may start the installation:

[oracle@mercur database]$ ./runInstaller -silent -responseFile /media/sf_bazar/orakits/rsps/db_install_12102_ee_norac.rsp -waitforcompletion
Starting Oracle Universal Installer...

Checking Temp space: must be greater than 500 MB.   Actual 39043 MB    Passed
Checking swap space: must be greater than 150 MB.   Actual 4086 MB    Passed
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2015-01-19_02-12-48PM. 
Please wait ...You can find the log of this install session at:
/u01/app/oraInventory/logs/installActions2015-01-19_02-12-48PM.log
The installation of Oracle Database 12c was successful.
Please check '/u01/app/oraInventory/logs/silentInstall2015-01-19_02-12-48PM.log' for more details.

As a root user, execute the following script(s):
        1. /u01/app/oraInventory/orainstRoot.sh
        2. /u01/app/oracle/product/12.1.0/2e_db/root.sh

Successfully Setup Software.

As root, run the above two scripts and you’re good to go.

To silently create the listener, we can also use the Oracle provided RSP file. For a standard LISTENER, there’s no need to edit anything.

[oracle@mercur response]$ /u01/app/oracle/product/12.1.0/2e_db/bin/netca -silent -responsefile /media/sf_bazar/orakits/12102_linux64/database/response/netca.rsp 

Parsing command line arguments:
    Parameter "silent" = true
    Parameter "responsefile" = /media/sf_bazar/orakits/12102_linux64/database/response/netca.rsp
Done parsing command line arguments.
Oracle Net Services Configuration:
Profile configuration complete.
Oracle Net Listener Startup:
    Running Listener Control: 
      /u01/app/oracle/product/12.1.0/2e_db/bin/lsnrctl start LISTENER
    Listener Control complete.
    Listener started successfully.
Listener configuration complete.
Oracle Net Services configuration successful. The exit code is 0

Finally, a database can be created using a response file. You can use the following command:

[oracle@mercur ~]$ dbca -silent -responseFile /media/sf_bazar/orakits/rsps/dbca_12c_noasm_pluggable.rsp                                                                                        
Enter SYS user password: 

Enter SYSTEM user password: 

Enter PDBADMIN User Password: 

Cleaning up failed steps
4% complete
Copying database files
5% complete
6% complete
12% complete
17% complete
30% complete
Creating and starting Oracle instance
32% complete
35% complete
36% complete
37% complete
41% complete
44% complete
45% complete
48% complete
Completing Database Creation
50% complete
53% complete
55% complete
63% complete
71% complete
74% complete
Creating Pluggable Databases
79% complete
100% complete
Look at the log file "/u01/app/oracle/cfgtoollogs/dbca/mercdb/mercdb.log" for further details.

Mission accomplished! It is important to note that the vast majority of installers accept a wide range of parameters which corresponds to options from the response file. The argument value from the command line has precedence.

Common Issues

Relative Paths

Pay attention not to provide the response file with a relative location. Apparently, OUI is not smart enough to figure out how to handle this scenario:

[oracle@mercur database]$ ./runInstaller -silent -responseFile ../../rsps/db_install_12102_ee_norac.rsp
Starting Oracle Universal Installer...

Checking Temp space: must be greater than 500 MB.   Actual 39052 MB    Passed
Checking swap space: must be greater than 150 MB.   Actual 4086 MB    Passed
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2015-01-19_01-06-47PM. Please wait ...[oracle@mercur database]$ [FATAL] [INS-10101] The given response file ../../rsps/db_install_12102_ee_norac.rsp is not found.
  CAUSE: The given response file is either not accessible or do not exist.
  ACTION: Give a correct response file location. (Note: relative path is not supported)
A log of this session is currently saved as: /tmp/OraInstall2015-01-19_01-06-47PM/installActions2015-01-19_01-06-47PM.log. Oracle recommends that if you want to keep this log, you should move it from the temporary location.

Background Installation

By default, the OUI is launching the installation session in background as a separate process. Personally, I don’t like this style as it force me to immediately go and tail the log file. Furthermore, in Linux, the console is not completely detached and random messages are printed by the background install process. In order to avoid this, you must provide the -waitforcompletion parameter. The problem I’ve noticed however when using this parameter is that the log file is automatically deleted at the end of the installer command. Yet, another stupidity of Oracle.

OUI Complains about My Oracle Support

If you get:

[oracle@mercur database]$ ./runInstaller -silent -responseFile /media/sf_bazar/orakits/rsps/db_install_12102_ee_norac.rsp -waitforcompletion
Starting Oracle Universal Installer...

Checking Temp space: must be greater than 500 MB.   Actual 39052 MB    Passed
Checking swap space: must be greater than 150 MB.   Actual 4086 MB    Passed
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2015-01-19_01-27-51PM. Please wait ...
[WARNING] - My Oracle Support Username/Email Address Not Specified
A log of this session is currently saved as: /tmp/OraInstall2015-01-19_01-27-51PM/installActions2015-01-19_01-27-51PM.log. 
Oracle recommends that if you want to keep this log, you should move
it from the temporary location.

Check if the DECLINE_SECURITY_UPDATES variable is set to true into the response file.

Things I don’t Like

Below are the things I don’t like as far as the response files feature is concerned:

  • The response file is a static dumb file. You cannot take values from environment variables or interact with other external interfaces. However, I think you can use m4 parser to generate a new response file on the fly.
  • The character-sets from the provided dbca.rsp template file are kind of stupid. For example, for the default characterset, the is US7ASCII assumed which isn’t such a good idea.
  • I couldn’t find anything related to OBFUSCATEDPASSWORDS parameter. How is it supposed to obfuscate these passwords? You may find some clues if you opt to install the database using OUI and then if you’re take a look into the log. You may find something like this:

    INFO: Configuration assistant "Oracle Net Configuration Assistant" succeeded 
    INFO: Command = /export/home/oracle/product/12.1.0/Db_3/bin/dbca  
            -progress_only 
            -createDatabase 
            -templateName General_Purpose.dbc 
            -gdbName orcl -sid orcl  
            -sysPassword 052b3cccab83184dd1ba5e728aa1ff1e189a90088c2da70291  
            -systemPassword 053e5dc6ab23642c16821c915b858ddc23d87c4a357e140d8e  
            -sysmanPassword 05b123a3abe32968b37230926a9bfc13f4d4ef5e52f01e095d  
            -dbsnmpPassword 058be1aeaba392a9299e7d3d8215720402fe045d76a03d4506  
            -emConfiguration LOCAL     
            -datafileJarLocation /export/home/oracle/product/12.1.0/Db_3/assistants/dbca/templates  
            -datafileDestination /export/home/oracle/product/12.1.0/oradata/ 
            -responseFile NO_VALUE   
            -characterset WE8ISO8859P1   
            -obfuscatedPasswords true
    

    How those passwords are encrypted I have no clue. As an workaround you can get those values from the log and use it in your own response file. Of course, this is cumbersome and something you don’t want to do too often.

  • You can’t specify the options you want to install directly into the response file. There’s nothing related to “Spacial” option or “Apex” there. The only way is to create your own custom template and use it in the response file.

DBCA segmentation fault

Finally, I have succeeded to install the Oracle 12.1.0.2 database! Big deal, you’d say! Well, yes! Usually, installing a new Oracle database on a fresh server shouldn’t be a problem: ./runInstaller, next, next, software only, next next, dbca… In my case there was no problem installing the software, but when I was about to run the “dbca” tool, it always complained with something like this:

[oracle@mercur ~]$ dbca
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f7a21db777f, pid=2856, tid=140162988832512
#
# JRE version: 6.0_75-b13
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20.75-b01 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [libclntshcore.so.12.1+0x1ba77f]  long+0x8f
#
# An error report file with more information is saved as:
# /home/oracle/hs_err_pid2856.log
#
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
Aborted (core dumped)

Ouch! It turned out that the problem was my NLS_DATE_FORMAT environment variable:

[oracle@mercur ~]$ echo $NLS_DATE_FORMAT
yyyy-mm-dd hh 24:mi:ss

Note that stupid blank between “hh” and “24”. After I fixed the format my “dbca” was happy again.

My Portable Oracle Playground

Have I told you I bought a new laptop? Yep, it’s a Lenovo W530. I was looking for a powerful toy, extensible, not very expensive and not too heavy. Apparently, this W530 bad boy meets all the requirements: it has a CPU with 4 hyper-threaded cores, it supports up to 32GB of RAM, another HDD can be added by replacing the CD-ROM with a hard drive caddy, it had a price of ~1000$ on EBay (refurbished item) and it weights ~3kg. Ah, not to mention the keyboard. I hate the numeric pad on a laptop, that’s why I said NO to the new Lenovo W540. Another requirement I paid attention to was the display. I needed to be anti-glare. The old laptop I owned had a glare display and it was a PITA to use it when light was pointing directly to the display. I got sick and tired of my face mirrored in that fancy glare display. Anyway, let’s cut to the chase and talk about the way I configured my new laptop.

The OS

First of all, I rapidly get rid of the pre-installed Windows 8 operating system and I installed Arch Linux. I love Arch Linux for its simplicity and for being a rolling upgrade OS. In the past I wasn’t amused at all with Ubuntu or Fedora, especially when it came to upgrade to the next major version. What happened? Well, a lot of my customizations were simply lost or not compatible with the new version. With Arch Linux I’m always up to date and I don’t bother with major versions at all. Of course, sometimes things break on Arch Linux as well and manual intervention is needed, but most of these cases are well documented on the community front page of Arch Linux users. Not to mention their huge collection of wiki pages which are great for learning.

As far as the desktop manager is concerned I am a big fan of XFCE which is in line with the simplicity of the Arch Linux in general. So, I don’t waste my time with fancy desktop managers like Gnome Shell or Unity. Maybe I’m getting old and that’s why I find it difficult to get into this new style of managing the desktop.

The Goals

Arch Linux is not a supported OS for Oracle, that’s for sure. But I didn’t want to pollute the system with all kind of Oracle versions anyway. My plan was to build up a portable Oracle playground, everything being wrapped around the well known virtualization tool (yep, I know you know it), VirtualBox. Unlike its counter part, VMWare, I like VirtualBox more because of its simplicity. Maybe also because of my bias for Oracle, Virtualbox being under the same Oracle umbrella? Guilty as charged! Of course, there are also many things I don’t like when it comes to Virtualbox. For example, why it is not possible to detach the window of a running VM? I know I can use the headless mode, but still…

Note: Recent versions of VMWare knows about the “detachable start” feature.

Let’s summarize the main goals I want to achieve:

  1. multiple VMs running on my laptop
  2. these virtual hosts must run in their isolated network. I don’t want to let those VMs to register in my home/company network.
  3. full control of the DNS. I want to access every VM using a nice DNS name and those names must be under my full control. I don’t want to bother a network admin to create a SCAN address for me in order to be able to build my own guineea pig RAC.
  4. the VMs must have access to the internet, especially for yum updates or other administrative tasks.
  5. creating a new VM should be as easy as possible and automatic. I don’t want to click, click through all those boring installers.

Tools I Used

Well, nothing special. Let’s see:

  1. bind for my own DNS server
  2. VirtualBox as the tool of choice for virtualization
  3. Packer, in order to automate the born of a new VM
  4. squid, to be able to give internet access to the virtual hosts
  5. bash, for some custom scripts

Folders Layout

I’d like to keep things clear, therefore everything related to this virtual playground was nicely packed within the following folders layout:

mkdir -p /vmstage/hosts
mkdir -p /vmstage/packer/bin
mkdir -p /vmstage/packer/cache
mkdir -p /vmstage/packer/config
mkdir -p /vmstage/packer/output
mkdir -p /vmstage/packer/public
mkdir -p /vmstage/packer/scripts
mkdir -p /vmstage/share/disks
mkdir -p /vmstage/share/bazar

The OS user I use on my laptop is “talek”, so I also had to change the ownership:

chown talek:talek /vmstage -R

A short description of the folders above:

  • in /vmstage/hosts are all my VMs
  • in /vmstage/packer I put everything related to automating the provisioning of a new VM.
  • in /vmstage/shared is everything which is shared across VMs. The “bazar” is a public share and in “disks” are those shared disks involved especially in RAC configurations.

VirtualBox Setup

Arch Linux guys provide the VirtualBox package from their official repository. You may use the this guide to complete the whole VirtualBox installation.

As soon as the installation is finished, two settings are important to be configured:

  1. in “Preferences/General” set the /vmstage/hosts as the default machine folder.
  2. in “Preferences/Network” add a host-only network called “vboxnet0”. Double check the IP address of this interface using the configuration button from the same preferences window. It should be 192.168.56.1.

DNS Setup

I decided to create an isolated network for my VMs, called “localnet”. So, I have registered my laptop as such (note the domain name):

[talek@hugsy ~]$ cat /etc/hostname
hugsy.localnet

Then, assuming you have installed “bind” from the official repository, go ahead and add the following entries in your /etc/named.conf file:

zone "localnet" IN {
        type master;
        file "localnet.zone";
        allow-update { none; };
        notify no;
};

zone "56.168.192.in-addr.arpa" {
        type master;
        file "192.168.56.in-arpa.zone";
};

In the same /etc/named.conf file set the following properties:

options {
    listen-on { 127.0.0.1; 192.168.56.1; };
    forwarders {192.168.1.1;8.8.8.8;};
};

By these settings we ask bind to listen on localhost and on the host-only VirtualBox address. The forwarders are used to delegate the outside internet requests to the real ISP providers. In the above example, the first IP is my home rooter address, the second one is from my office.

The first zone is for registering the VM hosts names, the other one is for reverse DNS, which is especially important for RAC setups. The actual files are:

[talek@hugsy ~]$ cat /var/named/localnet.zone
$TTL 7200
; localnet
@       IN      SOA     ns01.localnet. postmaster.localnet. (
                                        2007011601 ; Serial
                                        28800      ; Refresh
                                        1800       ; Retry
                                        604800     ; Expire - 1 week
                                        86400 )    ; Minimum
                IN      NS      ns01
                IN      NS      ns02
ns01            IN      A       0.0.0.0
ns02            IN      A       0.0.0.0
localhost       IN      A       127.0.0.1
@               IN      MX 10   mail
imap            IN      CNAME   mail
smtp            IN      CNAME   mail
@               IN      A       0.0.0.0
www             IN      A       0.0.0.0
mail            IN      A       0.0.0.0
@               IN      TXT     "v=spf1 mx"
gateway         IN      A       192.168.56.1
mercur          IN      A       192.168.56.2

[talek@hugsy ~]$ cat /var/named/192.168.56.in-arpa.zone 
$ORIGIN 56.168.192.IN-ADDR.ARPA.

@                       1D IN SOA       localnet. ns.localnet. (
                                        42              ; serial (yyyymmdd##)
                                        3H              ; refresh
                                        15M             ; retry
                                        1W              ; expiry
                                        1D )            ; minimum ttl

                        1D IN NS        localnet.
1                       1D IN PTR       localnet.
2                          IN PTR       mercur.localnet.

As you can notice, I have already registered “mercur” host. Now, we need to tell the system to resolve names through our local DNS server. In order to do this I changed the /etc/resolv.conf file as shown below:

search localnet
nameserver 127.0.0.1

In addition, just to avoid the overwrite of this file by the dhcp daemon, I set the immutable flag:

chattr +i /etc/resolv.conf

Almost done! The “named” service must be enabled:

systemctl enable named

Another thing I wanted to happen was to be able to edit and restart the DNS server from my personal user. So, I added myself to the “named” group:

gpasswd -a talek named

Then I set the ownership and the rights for the zone files as below:

chown root:named /var/named/localnet.zone
chown root:named /var/named/192.168.56.in-arpa.zone
chmod g+w /var/named/localnet.zone
chmod g+w /var/named/192.168.56.in-arpa.zone

To restart the DNS server from “talek” user, I used the well-known “sudo”. I just added these using the “visudo” command:

talek hugsy =NOPASSWD: /usr/bin/systemctl restart named

With this setting in place I can easily do:

sudo systemctl restart named

Great, done with the DNS!

Packer Setup

Packer IO is not part of the official Arch Linux repositories. However, it is available on AUR and with yaourt is just a matter of:

yaourt packer

Then, I configured a packer template to install an Oracle Linux 6.3.

[talek@hugsy ~]$ cat /vmstage/packer/config/ol63_x64_non_X.json
{
  "variables": {
    "name": "",
    "memory": "1024",
    "ip": "127.0.0.1"
  },
  "builders": [{
    "type": "virtualbox-iso",
    "guest_os_type": "Oracle_64",
    "iso_url": "/vmstage/share/bazar/oskits/ol63_x64/V33411-01.iso",
    "iso_checksum": "3D1F4B2D17176821E32281B9B4A3D000",
    "iso_checksum_type": "none",
    "boot_command": [
      "<esc>",
      " linux ksdevice=eth0 ks=http://:/ol63_non_x.ks",
      "<enter>",
      "<wait>"
    ],
    "boot_wait": "5s",
    "disk_size": 50000,
    "ssh_username": "root",
    "ssh_password": "muci123",
    "http_directory" : "/vmstage/packer/public",
    "http_port_min" : 9001,
    "http_port_max" : 9001,
    "headless" : false,
    "shutdown_command" : "shutdown -h now",
    "shutdown_timeout" : "10s",
    "vm_name" : "",
    "output_directory" : "/vmstage/packer/output/",
    "vboxmanage": [
      ["modifyvm", "", "--memory", ""],
      ["modifyvm", "", "--cpus", "1"],
      ["modifyvm", "", "--nic1", "nat"],
      ["sharedfolder", "add", "", "--name", "bazar", "--hostpath", "/vmstage/share/bazar"]
    ]

  }],
  "provisioners": [{
    "type": "shell",
    "script": "/vmstage/packer/scripts/vm_oel_post_setup.sh",
    "environment_vars" : ["PACKER_VMNAME=", "PACKER_VMIP="]
  }]
}

I will not cover all the Packer.IO details. They are described on Packer official web site. However, note that I downloaded the Oracle Linux ISO and I put it in /vmstage/share/bazar/oskits/ol63_x64 folder. This file is referenced in iso_url attribute. The final provisioned image will be stored in the /vmstage/packer/output folder and it will have a preconfigured VBox share pointing to /vmstage/share/bazar. Other interesting attributes are boot_command and http_directory. Packer uses them in order to boot the machine using a kickstart template which is provided in /vmstage/packer/public. The kickstart file is this:

[talek@hugsy ~]$ cat /vmstage/packer/public/ol63_non_x.ks
#platform=x86, AMD64, or Intel EM64T
#version=DEVEL
# Firewall configuration
firewall --disabled
# Install OS instead of upgrade
install
# Use CDROM installation media
cdrom
# Root password
rootpw --iscrypted $1$mcvra6e/$B3vCQ0OpTXdOZf/YNDU1O.
# System authorization information
auth  --useshadow  --passalgo=sha512
# Use text install
text
firstboot --disable
# System keyboard
keyboard us
# System language
lang en_US
# SELinux configuration
selinux --disabled
# Installation logging level
logging --level=info
# Reboot after installation
reboot
# System timezone
timezone --isUtc Europe/Bucharest
# Use eth0 by default
network --device eth0
#network --device eth1
# System bootloader configuration
bootloader --location=mbr
# Partition clearing information
clearpart --all
zerombr
# Disk partitioning information
part / --fstype="ext4" --grow --size=1

%packages
@base
kernel-devel
kernel-headers
gcc
binutils
make
patch
glibc-headers
glibc-devel

%end

As soon as the OS in the virtual host is installed the Packer “provisioners” kick in. In my configuration file I use a script which does all the post installation tasks:

[talek@hugsy ~]$ cat /vmstage/packer/scripts/vm_oel_post_setup.sh 
#!/bin/bash

sleep 10
echo $PACKER_VMNAME
hostname $PACKER_VMNAME
echo "NETWORKING=yes" > /etc/sysconfig/network
echo "HOSTNAME=$PACKER_VMNAME" >> /etc/sysconfig/network
echo "GATEWAY=192.168.56.1" >> /etc/sysconfig/network

echo "IPADDR=$PACKER_VMIP" >> /etc/sysconfig/network-scripts/ifcfg-eth0
echo "DNS1=192.168.56.1" >> /etc/sysconfig/network-scripts/ifcfg-eth0
echo 'IPV6INIT="no"' >> /etc/sysconfig/network-scripts/ifcfg-eth0
echo 'BOOTPROTO="static"' >> /etc/sysconfig/network-scripts/ifcfg-eth0

printf "\n$PACKER_VMIP ${PACKER_VMNAME}.localnet ${PACKER_VMNAME}\n" >> /etc/hosts

printf "\nAddressFamily inet\n" >> /etc/ssh/sshd_config
printf "\nGSSAPIAuthentication no\n" >> /etc/ssh/sshd_config
printf "\nUseDNS no\n" >> /etc/ssh/sshd_config
printf "\nHost *\n" >> /etc/ssh/ssh_config
printf "\n  AddressFamily inet\n" >> /etc/ssh/ssh_config

printf "\nexport http_proxy='http://192.168.56.1:8080'\n" >> /etc/profile
printf "\nexport ftp_proxy='http://192.168.56.1:8080'\n" >> /etc/profile

printf "\nproxy=http://192.168.56.1:8080\n" >> /etc/yum.conf

printf "\nappend domain-name \"localnet\";\n" >> /etc/dhcp/dhclient-eth0.conf

mount -o loop /root/VBoxGuestAdditions.iso /mnt
/mnt/VBoxLinuxAdditions.run --nox11

exit 0

The script above configures the networking of the new provisioned host and it installs the VBox guest additions package.

The Proxy Setup

Our VMs talks to each other using the VirtualBox Host-Only adapter. This adapter supports the concept of an isolated network I was talking about at the very beginning of this post. The network traffic can take place just between VMs and the host, but not outside. This means that the main host can reach the Internet using the other non-VBox adapter, but the VMs can’t. So, the workaround I found is to use a proxy which listens on the Host-Only adapter and then funnels the outside requests to the other non-VBox adapter.

So, I installed “squid” from the official repositories:

pacman -S squid

Usually it’s just a matter of changing the bind address of the server in the configuration file, which is /etc/squid/squid.conf.

The last step was to enable the squid service:

systemctl enable squid

Automation Baby

Everything needs to be glued nicely so that to be able to create a new VM as easy as possible. So, in /vmstage/Packer/bin I created the following bash script:

#!/bin/bash

PACKER_DIR="$(dirname $( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd ))"
EDITOR="/home/talek/vim/bin/vim"

printf 'VM Host Name: '
read host_name

printf 'Memory footprint: '
read memory

echo "Add $host_name to DNS..."
$EDITOR -c 'silent edit /var/named/192.168.56.in-arpa.zone | silent split /var/named/localnet.zone'

echo "Restart DNS daemon..."
sudo systemctl restart named

echo "Check DNS registration..."
ip=`dig ${host_name} A +short +search | tail -n1`
echo "IP address of $host_name is: $ip"

if [ -z "$ip" ]; then
  echo "DNS registration error!"
  exit 1
fi

nslookup $ip > /dev/null
if [ $? -ne 0 ]; then
  echo "Reverse DNS registration error!"
  exit 1
else
  echo "Reverse DNS ok."
fi

printf "\n\n***Choose what do you want to provision***\n\n"
PS3='OS template: '
options=($(ls $PACKER_DIR/config))
select opt in "${options[@]}"
do
  echo "you selected: $opt"
  export PACKER_CACHE_DIR='/vmstage/packer/cache'
  packer-io build -var "name=$host_name" -var "memory=$memory" -var "ip=$ip" $PACKER_DIR/config/$opt

  echo "Importing image..."
  VBoxManage import ${PACKER_DIR}/output/${host_name}/${host_name}.ovf --options keepallmacs
  vboxmanage modifyvm ${host_name} --nic1 hostonly
  vboxmanage modifyvm ${host_name} --hostonlyadapter1 "vboxnet0"

  rm -rf $PACKER_DIR/output/$host_name
  break
done

printf "\n\n***done***\n\n"

Of course, I gave execute rights to this script and now I launch it as often as needed. Below is a sample of the output of this script:

[talek@hugsy bin]$ ./new_host
VM Host Name: mercur
Memory footprint: 2048
Add mercur to DNS...
Restart DNS daemon...
Check DNS registration...
IP address of mercur is: 192.168.56.2
Reverse DNS ok.


***Choose what do you want to provision***

1) ol63_x64_non_X.json
2) ol63_x64_with_X.json
OS template: 1
you selected: ol63_x64_non_X.json
virtualbox-iso output will be in this color.

Warnings for build 'virtualbox-iso':

* A checksum type of 'none' was specified. Since ISO files are so big,
a checksum is highly recommended.

==> virtualbox-iso: Downloading or copying Guest additions
    virtualbox-iso: Downloading or copying: file:///usr/lib/virtualbox/additions/VBoxGuestAdditions.iso
==> virtualbox-iso: Downloading or copying ISO
    virtualbox-iso: Downloading or copying: file:///vmstage/share/bazar/oskits/ol63_x64/V33411-01.iso
==> virtualbox-iso: Starting HTTP server on port 9001
==> virtualbox-iso: Creating virtual machine...
==> virtualbox-iso: Creating hard drive...

....

Successfully imported the appliance.


***done***

Mission accomplished! I know there’s a lot of room for improvements but for now I’m quite pleased with the results.