Ansible Tower – Custom Credentials Type

Within playbooks you occasionally connect to external applications or services, in my case Zabbix and ServiceNow. Because these connections need login details, and I do not want to leave them in plain text in my playbooks, I use a 'Custom Credential Type'. The advantage is that I can use the login details within a playbook (as a macro) while they are stored encrypted in Ansible Tower.

I first create a new credential type by defining the fields it will have and how these will be passed to my playbook. Credential types consist of two parts – “inputs” and “injectors”.

  • Inputs:
    define the value types that are used for this credential – such as a username, a password, a token, or any other identifier that’s part of the credential.
  • Injectors:
    describe how these credentials are exposed for Ansible (or us) to use – this can be Ansible extra variables, environment variables, or templated file content.

Both of these configurations are specified as YAML or JSON. In my case, the new credential type is called "ServiceNow" and I'm providing the instance, username and password as part of this credential type:

fields:
  - id: instance
    type: string
    label: ServiceNow Instance
  - id: username
    type: string
    label: ServiceNow Username
  - id: password
    type: string
    label: ServiceNow password
    secret: true
required:
  - instance
  - username
  - password

Then in the Injector configuration:

extra_vars:
  snow_instance: '{{ instance }}'
  snow_password: '{{ password }}'
  snow_username: '{{ username }}'

Now go to Credentials and add a new one, selecting "ServiceNow" as the Credential Type.

That's it! When you link this credential to your host or playbook, you can use these credentials from within your playbook, as in the sketch below.
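As an illustration, here is a minimal playbook sketch that uses the injected snow_* variables to call the ServiceNow REST API. The incident table, URL pattern, and the assumption that the instance field holds the subdomain are examples, not part of the credential type itself:

- name: Example ServiceNow lookup using the injected credential
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Fetch one incident via the ServiceNow Table API
      uri:
        url: "https://{{ snow_instance }}.service-now.com/api/now/table/incident?sysparm_limit=1"
        user: "{{ snow_username }}"
        password: "{{ snow_password }}"
        method: GET
        force_basic_auth: true
        return_content: true
      register: snow_result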

Enable ‘Previous Versions’

Anyone who has ever trashed a spreadsheet or accidentally deleted a file will appreciate the 'Previous Versions' feature. Unfortunately it is not enabled by default, and you usually only find that out when it is already too late.

You can enable previous versions by turning on shadow copies at the volume level: Server Manager > Tools > Computer Management > Shared Folders (right-click) > All Tasks > Configure Shadow Copies > select the volume > Enable. It can take up about 15% of your space, so make sure you have enough room.

In my case I want a copy every hour: go to the Advanced Schedule Options dialog, select Repeat task, set the frequency to every 1 hour, then select the time and change it to 2:58 AM. Alternatively, you can script it, as shown below.
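If you prefer to script this instead of clicking through the GUI, something along these lines should work from an elevated prompt (a sketch; the task name is an example and the volume letter may need adjusting):

schtasks /Create /TN "HourlyShadowCopyC" /SC HOURLY /ST 02:58 /RU SYSTEM /TR "vssadmin create shadow /for=C:"
vssadmin list shadows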

Enable LLDP on Windows Server 2016/2019

The Link Layer Discovery Protocol (LLDP) is a vendor-neutral link layer protocol used by network devices for advertising their identity, capabilities, and neighbors on a local area network based on IEEE 802 technology, principally wired Ethernet. The protocol is formally referred to by the IEEE as Station and Media Access Control Connectivity Discovery, specified in IEEE 802.1AB and IEEE 802.3 section 6 clause 79.

The following installs the DataCenterBridging feature and enables LLDP on all connected Ethernet interfaces:

Enable-WindowsOptionalFeature -Online -FeatureName 'DataCenterBridging'
Get-NetAdapter | Where-Object { $_.Name -like "*Ethernet*" -and $_.Status -eq 'Up' } | ForEach-Object { Enable-NetLldpAgent -NetAdapterName $_.Name -Verbose }
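Afterwards you can check which adapters now have an LLDP agent enabled (Get-NetLldpAgent ships with the same cmdlet set):

Get-NetLldpAgent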

MySQL – Clear Disk Space

When you are running out of disk space, you can purge the MySQL binary logs to free some of it up:

mysql> PURGE BINARY LOGS BEFORE 'yyyy-mm-dd hh:mm:ss';
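If the situation is not critical yet, it is safer to first check which binary logs exist and purge up to a specific file, or let MySQL expire them automatically. A sketch (the file name and retention are examples; MySQL 8.0 uses binlog_expire_logs_seconds instead of expire_logs_days):

mysql> SHOW BINARY LOGS;
mysql> PURGE BINARY LOGS TO 'mysql-bin.000123';
mysql> SET GLOBAL expire_logs_days = 7;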

Sometimes you are already at 99% disk usage and need a more drastic method: manually removing the log files and trimming the binlog index.

systemctl stop mysql
# Find the binlog index file, count the log files, and select the oldest half
cd /var/log/mysql && a=$(ls | grep -v relay | grep bin.index) && b=$(wc -l < "$a") && c=$((b/2))
# Remove those files and drop their entries from the index
head -n "$c" "$a" | xargs rm -f && sed -i "1,${c}d" "$a"
systemctl start mysql

Enable NTP Server in Windows 2019

The Windows Time service uses the Network Time Protocol (NTP) to help synchronize time across a network. Enabling the server side is as easy as three commands in PowerShell:

Set-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Services\w32time\TimeProviders\NtpServer" -Name "Enabled" -Value 1
Set-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\services\W32Time\Config" -Name "AnnounceFlags" -Value 5 
Restart-Service w32Time
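To verify that the server is advertising and serving time, and to open UDP port 123 if the firewall blocks it (the rule name is just an example):

w32tm /query /status
New-NetFirewallRule -DisplayName "NTP Server (UDP 123)" -Direction Inbound -Protocol UDP -LocalPort 123 -Action Allow

From a client you can then test against the new server:

w32tm /stripchart /computer:<your-ntp-server> /samples:3 /dataonly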

Check nvme health and temperature – nvme-cli

Make sure nvme-cli is installed:

$ sudo apt install nvme-cli

Check for available nvme disks:

$ sudo nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev  
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     S4EVNFXXXXXXXX9972H      Samsung SSD 970 EVO Plus 500GB           1          26,60  GB / 500,11  GB    512   B +  0 B   2B2XXXXXM7

With nvme-cli you can now check the internal temperature, disk usage, power cycles, and much more:

$ sudo nvme smart-log /dev/nvme0
Smart Log for NVME device:nvme0 namespace-id:ffffffff
critical_warning                    : 0
temperature                         : 40 C
available_spare                     : 100%
available_spare_threshold           : 10%
percentage_used                     : 0%
data_units_read                     : 90935
data_units_written                  : 119679
host_read_commands                  : 4491381
host_write_commands                 : 2370351
controller_busy_time                : 8
power_cycles                        : 34
power_on_hours                      : 9
unsafe_shutdowns                    : 1
media_errors                        : 0
num_err_log_entries                 : 0
Warning Temperature Time            : 0
Critical Composite Temperature Time : 0
Temperature Sensor 1                : 40 C
Temperature Sensor 2                : 38 C
Thermal Management T1 Trans Count   : 0
Thermal Management T2 Trans Count   : 0
Thermal Management T1 Total Time    : 0
Thermal Management T2 Total Time    : 0
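Because the output is plain text, it is easy to script a simple check around it. A minimal sketch, assuming the output format shown above (the device path and the 50 C threshold are examples):

#!/bin/sh
# Warn when the composite temperature of the drive reaches the threshold (run as root)
TEMP=$(nvme smart-log /dev/nvme0 | awk '/^temperature/ {print $3}')
[ "$TEMP" -ge 50 ] && echo "nvme0 is running hot: ${TEMP} C"

Run it from cron and you have a poor man's temperature alert.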

Ubuntu 18.04 – OpenVPN Server in less than 5 minutes

OpenVPN provides flexible VPN solutions to secure your data communications, whether it's for Internet privacy, remote access for employees, securing IoT, or networking Cloud data centers. The server can be deployed on-premises on standard servers or virtual appliances, or in the cloud.

Prepare your system

Make sure all the latest packages and updates have been installed:

$ sudo apt update
$ sudo apt upgrade
$ sudo apt dist-upgrade

Download and run the installation script

$ wget https://git.io/vpn -O openvpn-install.sh
$ sudo chmod +x openvpn-install.sh
$ sudo ./openvpn-install.sh 

The script will ask you a few questions for its basic configuration.
- When asked for your IP address, choose your WAN (public) address
- When asked for the protocol, I recommend the default, UDP
- The port can be anything you want; I normally keep the default
- When asked, pick 1.1.1.1 as your DNS server, as it is one of the fastest currently online.

After this the installation will go ahead and inform you when it's done. You can verify if OpenVPN is running or not:

$ sudo systemctl status openvpn@server # <--- get server status

You can start or stop OpenVPN with the following commands:

$ sudo systemctl stop openvpn@server # <--- stop server
$ sudo systemctl start openvpn@server # <--- start server
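To make sure the server comes back after a reboot, also enable it at boot:

$ sudo systemctl enable openvpn@server # <--- start at boot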

Client configuration

At the end of the installation you should see a message like this:

Your client configuration is available at: /root/bontekoe.ovpn

As I am using Linux (Ubuntu) on my laptop, I can simply copy that .ovpn file to my computer using scp.

$ sudo scp root@88.99.189.27:/root/bontekoe.ovpn /etc/openvpn/client.conf

This should be enough to connect! Check if everything is working by running:

$ sudo openvpn --client --config /etc/openvpn/client.conf

Now, by opening another terminal you should be able to ping 10.8.0.1 (the VPN host).
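Since the configuration was copied to /etc/openvpn/client.conf, you can also manage the tunnel through systemd instead of running openvpn by hand (on Ubuntu 18.04 the openvpn@ unit reads /etc/openvpn/<name>.conf):

$ sudo systemctl start openvpn@client # <--- connect
$ sudo systemctl enable openvpn@client # <--- connect at boot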

If you are running Windows, you can use the official OpenVPN GUI client.

Ubuntu 18.04 – Laggy bluetooth

After installing this release, my Bluetooth mouse and headphones became laggy. Here is the fix:

# HANDLE="$(hcitool con | grep '<Bluetooth Mouse mac address>' | awk '{print $5}')"  # get the device handle
# hcitool lecup --handle $HANDLE --latency 0 --min 6 --max 8  # tighten the connection interval

Network Monitoring – Traps vs. Polling

As a network administrator, I have been (partly) responsible for monitoring the network infrastructure, or even the entire company, of many customers. For some of them I have even redesigned it completely.

Many companies currently use tools such as Nagios, Solarwinds or a similar package. These tools are (in my opinion, at least) first-generation software packages, because configuring the triggers in particular takes a lot of time. If you also have to deal with multiple vendors it becomes even more complex: every vendor has its own event codes and descriptions, which leads to inconsistent and therefore unclear alerts.

Many parties also rely on SNMP traps. Although you will probably not want to disable SNMP traps completely, relying only on these traps is dangerous, for several reasons.

Why we cannot trust SNMP Traps

An SNMP trap is a single UDP datagram sent by the device. Consider, for example, a temperature that exceeds its limit: the switch sends one UDP datagram, once. UDP does not work the way TCP does, because there is no check whether the datagram ever arrived. TCP has retransmissions for this; UDP has no such control whatsoever.

In the real world it can happen that a switch gets into trouble, for example due to a spanning-tree recalculation, and the SNMP trap never arrives as a result. The notification will not be sent again and will never be registered. The same applies when the connection between your monitoring host and the device is unstable. Not something you can count on in the event of disruptions.

Next-generation alerts

As I wrote above, I see the packages mentioned as first-generation monitoring, mainly because of the way alerts are generated. For example, alerts are often configured to go off when 90% of a gigabit interface's bandwidth is in use, or when a hard disk has only 20% of its space left.

But is it useful to know that a lot of bandwidth is going through that specific interface? Maybe someone is watching a video on Netflix that first has to buffer. Is it useful to know that a hard drive has only 20% of its space left? For a small disk, maybe; for a 2 TB disk it hardly seems worth mentioning.

Future-proof approach to Monitoring and Alerts

The only way to generate correct alerts, alerts that actually require action, is to base them on metrics. Metrics, metrics, and I say it again: metrics. You can collect these metrics by polling the devices over SNMP, and from the collected data a trend line can be drawn. Using the examples above, trend analysis tells us whether the gigabit interface regularly hits 90% utilisation, how long it will take before a disk is really full, and whether that requires action now.
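To give an idea of how simple the collection side is: polling an interface counter boils down to something like this (host, community string, and interface index are examples; a real setup polls on an interval and stores the deltas):

$ snmpget -v2c -c public sw01.example.net IF-MIB::ifHCInOctets.1 IF-MIB::ifHCOutOctets.1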

Collecting metrics opens doors: much more insight into what is happening on the infrastructure, outages that can be prevented instead of resolved, better advice about capacity and growth, and an easier life for administrators.

The last huge difference is that all alerts are displayed in the same (uniform) way. We are no longer dependent on the quirks of individual vendors!

Steps to improve or setup network monitoring

  1. Draw the network infrastructure in a real-time map view
    No network administrator likes to make network drawings. Make it fun by playing with the real-time display of trends and data, and at the same time gain insight into the network and up-to-date documentation.
  2. Transform from network monitoring to service and chain monitoring
    By gaining insight into the actual availability of a service, it also becomes clear what the impact of an outage is for the customer.
  3. Reclassify alerts based on impact
    Given point 2, we now know what the impact is for a customer. Many services are implemented redundantly, which reduces that impact. Combine this with point 1 and you have a real-time view of where a disruption occurs, without first having to search the various devices for half an hour.
  4. Create a performance baseline
    By measuring the customer's services on performance (response time), in combination with the availability and response time of the network, it can quickly be determined whether there is congestion.
  5. Work with trend analysis and forecast alerting
    By making use of all the collected data, many false alerts can be prevented. The same data also supports predictions about future capacity and availability.

Benchmarking SSDs with fio

Fio, which stands for Flexible I/O Tester, is a free and open-source disk I/O tool used both for benchmarking and for stress/hardware verification. I mainly use it for benchmarking Ceph or specific SSD hardware.

When benchmarking an SSD, make sure it is pre-warmed first. This can be done with dd (careful: this writes over the whole device and destroys any data on it):

dd if=/dev/zero of=/dev/xvdb bs=100M &

After this you can start the performance measurement with fio. My advice is to run the test for at least 6 to 8 hours in order to get realistic data out of it.

fio --filename=/dev/nvmeXnXpX --direct=1 --rw=randwrite --refill_buffers --norandommap --randrepeat=0 --ioengine=libaio --bs=128k --iodepth=16 --numjobs=1 --time_based --runtime=86400 --group_reporting --name=benchtest

This command will run for 24 hours and perform a write-only workload of 128k blocks from a single process.

Random Read test

sudo fio --name=randread --ioengine=libaio --iodepth=16 --rw=randread --bs=4k --direct=0 --size=512M --numjobs=4 --runtime=240 --group_reporting

This will use 4 processes, run for 4 minutes, and perform random read IOPS only.

Random Read/Write test

sudo fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75

This will do a mixed random read/write test (75% reads) on a 4 GB file.