Brad Peczka's Blog $ cat /dev/random > /dev/blog

10 Oct 2011 (0 comments)

How to upgrade JUNOS on the SRX100

I've had an ongoing love affair with the Juniper SRX Series Gateways for the last year, ever since Juniper were kind enough to run a promotional event with WAIA which resulted in the SRX100B units being offered for sale at a rather cheap price.

One of the (minor) failings of the SRX100B is its limited storage, which can be a problem when it comes time to upgrade JUNOS. Fortunately, JUNOS provides the ability to load a system image off HTTP, or via a USB Drive. Here's how:

Preparation:
You need to download the appropriate JUNOS image from the Juniper support website. You then need to host this on a Web Server which is accessible from the SRX, or copy it to a USB Drive (depending on which route you're taking).
Firstly, we run request system storage cleanup. This will tidy up the onboard flash by deleting old log files, crash dumps, and temporary files.
We also run request system software delete-backup (generally not necessary on the SRX100, but it's here for completeness).

For HTTP:
To upgrade via HTTP, you need to run request system software add no-copy no-validate unlink http://yourwebsitehere/junosimage.tgz

For USB:
Insert your USB drive into one of the USB ports on the SRX. Then, drop into the FreeBSD shell by running the start shell command, and run ls /dev/ to locate the drive label (usually da0s1, da1s1 or similar).

Make a temporary mount point by running mkdir /var/usb, then mount the USB drive by running mount -t msdosfs /dev/drivelabel /var/usb

Once that completes, exit the shell by typing exit and then install the image by running request system software add no-copy no-validate unlink /var/usb/junosimage.tgz

All going well, it looks something like this:
brad@srx> request system software add no-copy unlink /var/usb/junos-srxsme-10.4R6.5-domestic.tgz
/var/tmp/incoming-package.52396 1149 kB 1149 kBps
Package contains junos-10.4R6.5.tgz ; renaming ...
NOTICE: Validating configuration against junos-10.4R6.5.tgz.
NOTICE: Use the 'no-validate' option to skip this if desired.
Formatting alternate root (/dev/da0s1a)...
/dev/da0s1a: 297.9MB (610028 sectors) block size 16384, fragment size 2048
using 4 cylinder groups of 74.47MB, 4766 blks, 9600 inodes.
super-block backups (for fsck -b #) at:
32, 152544, 305056, 457568
Checking compatibility with configuration
Initializing...
Verified manifest signed by PackageProduction_10_4_0
Verified junos-10.4R3.4-domestic signed by PackageProduction_10_4_0
Using junos-10.4R6.5-domestic from /altroot/cf/packages/install-tmp/junos-10.4R6.5-domestic
Copying package ...
Saving boot file package in /var/sw/pkg/junos-boot-srxsme-10.4R6.5.tgz
Verified manifest signed by PackageProduction_10_4_0
Hardware Database regeneration succeeded
Validating against /config/juniper.conf.gz
cp: /cf/var/validate/chroot/var/etc/resolv.conf and /etc/resolv.conf are identical (not copied).
cp: /cf/var/validate/chroot/var/etc/hosts and /etc/hosts are identical (not copied).
mgd: commit complete
Validation succeeded
Validating against /config/rescue.conf.gz
mgd: commit complete
Validation succeeded
Installing package '/altroot/cf/packages/install-tmp/junos-10.4R6.5-domestic' ...
Verified junos-boot-srxsme-10.4R6.5.tgz signed by PackageProduction_10_4_0
Verified junos-srxsme-10.4R6.5-domestic signed by PackageProduction_10_4_0
Saving boot file package in /var/sw/pkg/junos-boot-srxsme-10.4R6.5.tgz
JUNOS 10.4R6.5 will become active at next reboot
WARNING: A reboot is required to load this software correctly
WARNING: Use the 'request system reboot' command
WARNING: when software installation is complete
Saving state for rollback ...

Finally, run request system reboot, and watch as your SRX reboots with your new version of JUNOS!
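If you'd rather keep the whole sequence in one place, here's a rough Python sketch that just assembles the CLI commands from this post in order. The function name and its parameters are my own invention for illustration, and the USB route assumes the drive has already been mounted at /var/usb from the shell. It only builds strings; it doesn't talk to the SRX.

```python
def srx_upgrade_commands(image, source="http", host=None, mount_point="/var/usb"):
    """Return the JUNOS CLI commands for the upgrade, in the order used in this post."""
    commands = [
        "request system storage cleanup",
        "request system software delete-backup",
    ]
    if source == "http":
        if not host:
            raise ValueError("host is required for an HTTP upgrade")
        location = f"http://{host}/{image}"
    elif source == "usb":
        # Assumes the USB drive was mounted here beforehand via 'start shell'.
        location = f"{mount_point}/{image}"
    else:
        raise ValueError(f"unknown source: {source!r}")
    commands.append(
        f"request system software add no-copy no-validate unlink {location}"
    )
    commands.append("request system reboot")
    return commands


if __name__ == "__main__":
    for command in srx_upgrade_commands("junosimage.tgz", host="yourwebsitehere"):
        print(command)
```

Paste the printed lines into the CLI one at a time, so you can read the output of each step before moving on.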

18 May 2011 (2 comments)

NetApp SnapMirror issues with Riverbed Steelhead

I had a curly issue recently, involving two NetApp filers that were replicating over a Steelhead-optimised WAN.

The issue was that SnapMirror transfers weren't replicating between the two filers. Transfers would appear to be running, but would 'run' at the paltry rate of 5 bytes/sec and transfer a dismal 5MB over 24 hours. The network between the two filers, while congested, had been in place for some time prior and there had previously been no issues with SnapMirror replication. At times we were able to replicate as much as 1.3GB of data before the throughput vanished.

Some creative research led me to an old thread on the NetApp Community Forums. According to one of the responses, the issue had been seen before, and was the result of a TCP Windowing issue that caused the filers to reject altered packets - packets that had been optimised by the Steelhead. The workaround was to stop the replication on the filers, create a Pass Through rule on each Steelhead for connections between the two filers on port 10566, and restart the transfer.

Following the advice resulted in our transfers jumping back to their previous levels of ~200KB/s - 300KB/s, which resolved the issue. I'll be lodging a bug report with Riverbed, and will update this post in the event they update RiOS.

Tech Info: Both Steelheads were installed in an in-path configuration, running RiOS 6.1.3. The Filers were a FAS2020A running DOT 7.3.1.1P2, and a FAS3050C running DOT 7.3.3.

2 May 2011 (0 comments)

WordPress Automatic Upgrade via SSH

WordPress is a great blogging platform, but its automatic upgrade kinda sucks. Out of the box, you've got the option of FTPS (bad) or FTP (even worse). I'm not a fan of installing FTP daemons if I don't absolutely have to, and it seems a pity that there's no default support for upgrades via SSH.

Fortunately, someone's already thought about this and found a way to do it!

Check out this nifty little how-to over at kbeezie.com, which lists the relevant changes you'll need to make to your wp-config.php file in order to get it working. You'll also need to ensure that your web server user (www-data on most Linux systems) has appropriate permissions and access to your WordPress directory in order to perform the upgrade.
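For the permissions side of things, here's a small Python sketch that walks a WordPress directory and lists anything not owned by a given user ID. Ownership is only a rough stand-in for "appropriate permissions", and the function name and the example path in the comment are assumptions of mine, not something from the how-to.

```python
import os


def paths_not_owned_by(root, uid):
    """Walk root and return every directory or file whose owner is not uid."""
    mismatched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Check the directory itself, then each file directly inside it.
        for path in [dirpath] + [os.path.join(dirpath, f) for f in filenames]:
            if os.lstat(path).st_uid != uid:
                mismatched.append(path)
    return mismatched


# Example (hypothetical path; look up the www-data uid with pwd.getpwnam):
#   import pwd
#   paths_not_owned_by("/var/www/wordpress", pwd.getpwnam("www-data").pw_uid)
```

An empty list suggests the web server user owns the whole tree; anything it returns is worth a closer look before you try the upgrade.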

2 May 2011 (6 comments)

IBM NICs in VMware ESXi 4.0/4.1

A new IBM x3650 M3 landed on my desk last week, and I've spent some time having a play around with it.

The x3650 M3 is a nice machine, improving on the original x3650 and x3650 M2, and seems to retain most of the features and quality build that IBM (not Lenovo!) has been known for. I'm not too sure about the new IMM over the old IBM RSA (looks the same, works differently) - but that's a topic for another blog post.

This particular x3650 M3 came with a pair of IBM I340-T2 Dual Port Ethernet Adapters. These are standard Intel 82580 GigE chips, nothing special, but they weren't being detected by ESXi even though we were using the IBM customised installation. (For reference, the IBM part number of the affected cards is 49Y4230.) Researching further on the VMware HCL, it appeared that the NICs are supported but drivers aren't included by default on the ESXi media. You need to manually download the driver package from VMware, and inject the driver into ESXi in order to utilise the NICs. These NICs can be ordered with any IBM System x Server, so you can run into this issue on anything from an x3250 through to an x3950.

You'll be needing the vSphere CLI, and the ESX/ESXi 4.x Driver CD for Intel 82576 and 82580 Gigabit Ethernet Controller. Then, mount the ISO, extract the driver bundle from the 'offline-bundle' folder, and run the following commands in the vSphere CLI (being sure to substitute my server IP for your server IP, and change the path to the install bundle if required):

vicfg-hostops.pl --server 172.16.0.10 --operation enter

vihostupdate.pl --server 172.16.0.10 --install --bundle INT-intel-lad-ddk-igb-2.1.10.2-offline_bundle-268793.zip
Please wait patch installation is in progress ...
The update completed successfully, but the system needs to be rebooted for the changes to be effective.

vicfg-hostops.pl --server 172.16.0.10 --operation reboot

vicfg-hostops.pl --server 172.16.0.10 --operation exit
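Since the only things that change between hosts are the server IP and the bundle path, a tiny Python sketch can do the substitution for you. The function name is hypothetical; the commands and their order are simply the four shown above, and nothing here actually runs the vSphere CLI.

```python
def esxi_driver_install_commands(server, bundle):
    """Return the four vSphere CLI commands from this post for one host."""
    return [
        # Put the host into maintenance mode first.
        f"vicfg-hostops.pl --server {server} --operation enter",
        f"vihostupdate.pl --server {server} --install --bundle {bundle}",
        f"vicfg-hostops.pl --server {server} --operation reboot",
        # Leave maintenance mode once the host is back up.
        f"vicfg-hostops.pl --server {server} --operation exit",
    ]


if __name__ == "__main__":
    bundle = "INT-intel-lad-ddk-igb-2.1.10.2-offline_bundle-268793.zip"
    for command in esxi_driver_install_commands("172.16.0.10", bundle):
        print(command)
```

Run the printed commands one at a time, and wait for the host to finish rebooting before the final exit.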

20 Sep 2010 (2 comments)

MGCP Voice Gateway Configuration in CUCM

I always forget one of these commands when setting up an MGCP Gateway in CUCM - hopefully putting them here will help someone else in the future!

! Let's ensure the proper host and domain names are set
Router(config)#hostname VoiceGateway
VoiceGateway(config)#ip domain-name mydomain.com
! Enable MGCP
VoiceGateway(config)#mgcp
! Specify the IP Address of our Primary Subscriber
VoiceGateway(config)#mgcp call-agent 10.10.10.10
VoiceGateway(config)#ccm-manager mgcp
! Specify the IP Addresses of our backup CUCM Servers (Secondary Subscriber, Publisher)
VoiceGateway(config)#ccm-manager redundant-host 10.10.20.10 10.10.10.11
! Specify the TFTP Servers (used by CUCM to deploy XML Config files - most important!)
VoiceGateway(config)#ccm-manager config server 10.10.10.10 10.10.20.10 10.10.10.11
! Enable the XML Config Service
VoiceGateway(config)#ccm-manager config
! That's it!
VoiceGateway(config)#end

So now your Gateway is configured and will be polling the TFTP Servers for an XML configuration file. This file is generated once the Gateway is added in CUCM, is refreshed each time a change is made, and is pushed to the Gateway via TFTP whenever the config is applied or the Gateway is reset. To check the status of the config download, run the 'sh ccm-manager' command from the Gateway:

VoiceGateway>sh ccm-manager
MGCP Domain Name: VoiceGateway.mydomain.com
Priority        Status                   Host
============================================================
Primary         Registered               10.10.10.10
First Backup    Backup Ready             10.10.20.10
Second Backup   Backup Ready             10.10.10.11

Current active Call Manager:    10.10.10.10
Backhaul/Redundant link port:   2428
Failover Interval:              30 seconds
Keepalive Interval:             15 seconds
Last keepalive sent:            05:48:52 UTC Sep 20 2010 (elapsed time: 00:00:13)
Last MGCP traffic time:         05:48:52 UTC Sep 20 2010 (elapsed time: 00:00:13)
Last failover time:             11:50:58 UTC Sep 2 2010 from (10.10.10.10)
Last switchback time:           12:13:50 UTC Sep 2 2010 from (10.10.20.10)
Switchback mode:                Graceful
MGCP Fallback mode:             Enabled/OFF
Last MGCP Fallback start time:  10:10:28 UTC Sep 2 2010
Last MGCP Fallback end time:    11:51:15 UTC Sep 2 2010
MGCP Download Tones:            Disabled
TFTP retry count to shut Ports: 2

Backhaul Link info:
    Link Protocol:      TCP
    Remote Port Number: 2428
    Remote IP Address:  10.10.10.10
    Current Link State: OPEN
    Statistics:
        Packets recvd:   7382
        Recv failures:   0
        Packets xmitted: 6512
        Xmit failures:   0
    PRI Ports being backhauled:
        Slot 0, VIC 1, port 0
Configuration Auto-Download Information
=======================================
Current version-id: 1284616589-a3cd44fe-86bb-486f-a62e-a78bf2a71840
Last config-downloaded:00:00:00
Current state: Waiting for commands
Configuration Download statistics:
        Download Attempted             : 11
          Download Successful          : 11
          Download Failed              : 0
          TFTP Download Failed         : 0
        Configuration Attempted        : 6
          Configuration Successful     : 1
          Configuration Failed(Parsing): 0
          Configuration Failed(config) : 5
Last config download command: New Registration
FAX mode: cisco
Configuration Error History:
ccm-manager music-on-hold
end

Take note of the error count shown above. This Gateway had been configured beforehand, and started generating errors when attempting to download the XML Config. It turned out that the router already had the 'ccm-manager music-on-hold' statement applied, and so running a 'no ccm-manager music-on-hold' command was all it needed to complete a successful download.

Finally, another tip most people forget is that the MGCP Domain Name must match the Gateway Domain Name as configured in CUCM - if it's not the same, the Gateway won't register. Just a little thing to keep in the back of your mind as you're configuring your Gateways... good luck! :-)
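If you've got a few Gateways to check, here's a rough Python sketch that pulls the interesting bits out of saved 'sh ccm-manager' output: the MGCP Domain Name, the configuration download counters, and the error history. The function name and the dictionary layout are my own, and the parsing assumes the output looks like the sample above.

```python
import re

COUNTER = re.compile(r"^(.*?)\s*:\s*(\d+)$")


def parse_ccm_manager(output):
    """Extract domain name, download counters and error history from 'sh ccm-manager' text."""
    info = {"domain": None, "counters": {}, "errors": []}
    section = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("MGCP Domain Name:"):
            info["domain"] = line.split(":", 1)[1].strip()
        elif line == "Configuration Download statistics:":
            section = "stats"
        elif line.startswith("Configuration Error History:"):
            section = "errors"
        elif section == "stats":
            match = COUNTER.match(line)
            if match:
                info["counters"][match.group(1)] = int(match.group(2))
        elif section == "errors" and line and line != "end":
            # Each remaining line is a config statement that failed to apply.
            info["errors"].append(line)
    return info
```

Compare the returned domain against the Gateway Domain Name configured in CUCM, and treat a non-zero 'Configuration Failed(config)' counter as a cue to read the error history.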

23 Aug 2010 (0 comments)

Disabling the CUCM Corporate Directory

Situation: You want to disable the Corporate Directory on your Cisco IP Phones - maybe you're like me, and have ~300 phones in places where people shouldn't be able to look up the Managing Director's direct line.

Solution: You come across this awesome little post from the Chesapeake Netcraftsmen, which says (amongst other things) to assign these phones a new Common Phone Profile, configure Services Provisioning to use an External URL, and then disable Enterprise Subscription on the Corporate Directory. Right?

... not quite. In a situation where the Corporate Directory is required on more phones than not, having to subscribe each phone to it is a pain in the proverbial. However, if you plug a dummy URL into the 'Directory' Data Location on your phone configuration, it also wipes out your Missed/Placed/Received calls directories. Cisco's website is unsurprisingly sparse on where this information is pulled from, and how to get it back if you're in my situation.

Dumping the Console Log on a phone shows that the phone requests a phoneservices.xml file upon boot, which lists the Services that the Phone is subscribed to. This file looks very similar to the format of the http://cucmpublisher:8080/ccmcip/xmldirectory.jsp file (which provides the Personal/Corporate Directory URLs), except that it doesn't actually have any URLs in it.

On a whim, I cooked up an XML file with similar contents to the phoneservices.xml file, hosted it on a test server, and pointed a test phone to this file via its 'Directory' URL. The phone reloaded, and the Missed/Received/Placed Calls directories were back in business. You can get a copy of the file here, or view the actual code after the break.

6 May 2010 (1 comment)

Managing Volume License Keys

I spotted this little gem of a post while trawling Twitter for something totally unrelated, and realised that I've been looking for a tool like this for a number of years.

The tool in question is the Volume Activation Management Tool (VAMT) 2.0, a product which (in a nutshell) makes light work of managing all the keys you get as part of, say, your TechNet Plus subscription. You import your keys, and your machines, and the app tracks how many activations you have left on each key as well as showing which machine is running what software.

Great stuff - and don't forget to check out the stack of relevant literature available about VAMT on the Microsoft website.

(Sidebar: Aaron Parker's Stealthpuppy Blog is a great resource, and will probably be the blog that gets me using an RSS Reader one of these days.)

6 May 2010 (1 comment)

When the Citrix IMA Service fails to start…

... you run, very quickly, as your phone is about to start ringing.

In all seriousness, this one has bitten me on a few separate occasions. Always seems to occur immediately after a reboot, and there's generally no way to replicate it or 'cause' the issue. I've had servers run for weeks without any sign of it, and servers that have it crop up on every restart. Extensive reading of the Citrix and Microsoft KBs has revealed zilch in the way of a permanent fix, with the 'band-aid' being the solution outlined below.

Here's hoping they figure it out soon!

Scenario:
HP BL460c G1, Windows Server 2003 Standard 32-bit w/ Service Pack 2, Citrix Presentation Server 4.5 w/ Rollup Pack 6.

Symptoms:

Cause:
The folder "Documents and Settings\NetworkService\Application Data\Microsoft\Crypto" does not exist.

Solution:

  • Check if the above folder exists. If not, create it on the affected server.
    (Bootnote: You'll need to ensure you can view Hidden Files and Folders in order to see the NetworkService folder.)
  • Kill the process 'mfcom.exe' via the Task Manager
  • Start the Citrix Independent Management Architecture Service
  • Start the Citrix MFCOM Service
  • Start the Citrix SMA Service (though I've noticed this often starts up by itself, so you may not have to do this)

Credits to my esteemed colleague Grant for finding the fix.
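The first step of the fix is easy to script. Here's a minimal Python sketch that checks for the Crypto folder and creates it if it's missing. The function name is hypothetical, and you need to pass in the location of your "Documents and Settings" folder yourself, as the drive it lives on will vary. It doesn't touch the services; those still need restarting as per the list above.

```python
from pathlib import Path

# Folder path from the 'Cause' section, relative to "Documents and Settings".
CRYPTO_SUBPATH = Path("NetworkService") / "Application Data" / "Microsoft" / "Crypto"


def ensure_crypto_folder(profiles_root):
    """Create the NetworkService Crypto folder if missing; return True if it was created."""
    target = Path(profiles_root) / CRYPTO_SUBPATH
    if target.is_dir():
        return False
    target.mkdir(parents=True)
    return True
```

A return value of True tells you the folder was missing, which is a decent hint that this was the cause on that server.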

7 Jan 2010 (0 comments)

Exchange 2007 Services Shutdown Order

Following on from my earlier post regarding the fun and games I've had with Exchange 2007, here's a brief running sheet I use when I want to shut down Exchange services, but keep the server running.

This is especially handy when you're performing network adapter driver updates, and your Exchange Information Store is hosted on an iSCSI LUN. Driver updates while the Store is still running == weird, weird issues and potential Store corruption!

net stop msexchangeadtopology /y
net stop msftesql-exchange /y
net stop msexchangeis /y
net stop msexchangesa /y
net stop iisadmin /y

Once these services have shut down, you're free to proceed with any driver updates or cable pulling, or to continue shutting down other services on the same server.
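If you want the order enforced rather than remembered, here's a rough Python sketch that runs the same net stop commands in sequence and halts at the first one that fails. The function name and the stop-on-failure behaviour are my own choices, not part of the original running sheet.

```python
import subprocess

# Same services, same order, as the running sheet in this post.
EXCHANGE_STOP_ORDER = [
    "msexchangeadtopology",
    "msftesql-exchange",
    "msexchangeis",
    "msexchangesa",
    "iisadmin",
]


def stop_exchange_services(run=subprocess.call):
    """Stop each service in order; return the services stopped before any failure."""
    stopped = []
    for service in EXCHANGE_STOP_ORDER:
        # 'run' must return the exit code of the command it was given.
        if run(["net", "stop", service, "/y"]) != 0:
            break
        stopped.append(service)
    return stopped
```

Be aware that net stop may also return a non-zero code for a service that was already stopped, so treat an early halt as a prompt to check by hand, not as proof that something is broken.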

7 Jan 2010 (2 comments)

Troubleshooting Fun with Exchange 2007 Queues

I recently resolved an issue involving two Exchange 2007 servers in two different AD Sites. The issue was simply slow email delivery when emailing from Site 'A' to Site 'B', and a quick check showed that both servers had backlogged mail queues with no obvious cause.

Both sites are part of the same domain, and both servers are identical in hardware (HP DL380 G5) and patch levels (Windows Server 2003 Standard x64 R2, and Exchange 2007 SP2). Connectivity between both sites tested perfectly, and talking to other servers in each site also revealed no issues. It was only when the two Exchange servers attempted to communicate that the issue occurred.

Mail in both queues reported errors of "451 4.4.0 Primary target IP address responded with: "421 4.4.2 Connection dropped." Attempted failover to alternate host, but that did not succeed. Either there are no alternate hosts or delivery failed to all alternate hosts." or "421 4.4.2 Connection dropped.", which seemed to point to network issues. Packet captures from both servers also showed a large amount of retransmits on both SMTP and SMB communication:

SMTP:

338     XXXMAIL02    192.168.15.63   SMTP  SMTP:Cmd EHLO XXXMAIL02.testdomain.com, 31 bytes
1197   192.168.15.63   XXXMAIL02    SMTP  SMTP:Rsp 250 -YYYMAIL02.testdomain.com Hello [192.168.24.34], 255 bytes
1198  XXXMAIL02    192.168.15.63   SMTP  SMTP:Data Payload, 16 bytes
4159   XXXMAIL02    192.168.15.63   TCP     TCP:[ReTransmit #1198] [Bad CheckSum]Flags=...AP..., SrcPort=44217, DstPort=SMTP(25), PayloadLen=16, Seq=3183382952 - 3183382968, Ack=1774779495, Win=65181
8142   XXXMAIL02    192.168.15.63   TCP     TCP:[ReTransmit #1198] [Bad CheckSum]Flags=...AP..., SrcPort=44217, DstPort=SMTP(25), PayloadLen=16, Seq=3183382952 - 3183382968, Ack=1774779495, Win=65181
11786  XXXMAIL02    192.168.15.63   TCP     TCP:[ReTransmit #1198] [Bad CheckSum]Flags=...AP..., SrcPort=44217, DstPort=SMTP(25), PayloadLen=16, Seq=3183382952 - 3183382968, Ack=1774779495, Win=65181
15476  XXXMAIL02    192.168.15.63   TCP     TCP:[ReTransmit #1198] [Bad CheckSum]Flags=...AP..., SrcPort=44217, DstPort=SMTP(25), PayloadLen=16, Seq=3183382952 - 3183382968, Ack=1774779495, Win=65181
17902  XXXMAIL02    192.168.15.63   TCP     TCP:[ReTransmit #1198] [Bad CheckSum]Flags=...AP..., SrcPort=44217, DstPort=SMTP(25), PayloadLen=16, Seq=3183382952 - 3183382968, Ack=1774779495, Win=65181
20735  XXXMAIL02    192.168.15.63   TCP     TCP:[ReTransmit #1198] [Bad CheckSum]Flags=...AP..., SrcPort=44217, DstPort=SMTP(25), PayloadLen=16, Seq=3183382952 - 3183382968, Ack=1774779495, Win=65181
23227  XXXMAIL02    192.168.15.63   TCP     TCP:[ReTransmit #1198] [Bad CheckSum]Flags=...AP..., SrcPort=44217, DstPort=SMTP(25), PayloadLen=16, Seq=3183382952 - 3183382968, Ack=1774779495, Win=65181

SMB:

1/5/2010 15:22        14560  {TCP:358, IPv4:16}  XXXMAIL02     192.168.15.63   SMB     SMB:R; Negotiate, Dialect is NT LM 0.12 (#5), SpnegoNegTokenInit
1/5/2010 15:22        14650  {TCP:358, IPv4:16}  XXXXMAIL02    192.168.15.63   TCP     TCP:[ReTransmit #14560] [Bad CheckSum]Flags=...AP..., SrcPort=Microsoft-DS(445), DstPort=44946, PayloadLen=186, Seq=3446414444 - 3446414630, Ack=2070264315, Win=65398 (scale factor 0x0) = 65398
1/5/2010 15:22        14943  {TCP:358, IPv4:16}  XXXXMAIL02    192.168.15.63   TCP     TCP:[ReTransmit #14560] [Bad CheckSum]Flags=...AP..., SrcPort=Microsoft-DS(445), DstPort=44946, PayloadLen=186, Seq=3446414444 - 3446414630, Ack=2070264315, Win=65398 (scale factor 0x0) = 65398
1/5/2010 15:23        15334  {TCP:358, IPv4:16}  XXXXMAIL02    192.168.15.63   TCP     TCP:[ReTransmit #14560] [Bad CheckSum]Flags=...AP..., SrcPort=Microsoft-DS(445), DstPort=44946, PayloadLen=186, Seq=3446414444 - 3446414630, Ack=2070264315, Win=65398 (scale factor 0x0) = 65398
1/5/2010 15:23        15862  {TCP:358, IPv4:16}  XXXXMAIL02    192.168.15.63   TCP     TCP:[ReTransmit #14560] [Bad CheckSum]Flags=...AP..., SrcPort=Microsoft-DS(445), DstPort=44946, PayloadLen=186, Seq=3446414444 - 3446414630, Ack=2070264315, Win=65398 (scale factor 0x0) = 65398
1/5/2010 15:23        16383  {TCP:358, IPv4:16}  XXXXMAIL02    192.168.15.63   TCP     TCP:[ReTransmit #14560] [Bad CheckSum]Flags=...AP..., SrcPort=Microsoft-DS(445), DstPort=44946, PayloadLen=186, Seq=3446414444 - 3446414630, Ack=2070264315, Win=65398 (scale factor 0x0) = 65398
1/5/2010 15:23        17225  {TCP:358, IPv4:16}  XXXXMAIL02    192.168.15.63   TCP     TCP:[ReTransmit #14560] [Bad CheckSum]Flags=...AP..., SrcPort=Microsoft-DS(445), DstPort=44946, PayloadLen=186, Seq=3446414444 - 3446414630, Ack=2070264315, Win=65398 (scale factor 0x0) = 65398
1/5/2010 15:24        18568  {TCP:358, IPv4:16}  XXXXMAIL02    192.168.15.63   TCP     TCP:[ReTransmit #14560] [Bad CheckSum]Flags=...AP..., SrcPort=Microsoft-DS(445), DstPort=44946, PayloadLen=186, Seq=3446414444 - 3446414630, Ack=2070264315, Win=65398 (scale factor 0x0) = 65398
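Eyeballing a capture for retransmits gets old quickly, so here's a small Python sketch that counts them per original frame. It assumes each capture line is exported as text in the same style as the excerpts above, with a [ReTransmit #N] marker; the function name is my own.

```python
import re
from collections import Counter

RETRANSMIT = re.compile(r"\[ReTransmit #(\d+)\]")


def count_retransmits(capture_lines):
    """Map each original frame number to how many times it was retransmitted."""
    counts = Counter()
    for line in capture_lines:
        match = RETRANSMIT.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

Fed the two excerpts above, it would report seven retransmits each for frame 1198 (SMTP) and frame 14560 (SMB).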

Revisiting the issue, it was noticed that XXXMAIL02 had two network adapters in a Team, while YYYMAIL02 was running off a single network adapter. Both servers also had old network card drivers (the cards are HP NC373i Multifunction Gigabit Adapters, which are rebadged Broadcom cards, and were using driver v2.8.13.0 made on 30/06/2006), and as part of the troubleshooting we upgraded these drivers to the latest available versions (v5.0.13.0, 23/06/2009) at the next maintenance window. As part of the upgrade, XXXMAIL02 was changed from a Network Team to a single adapter, to match YYYMAIL02.

(Bootnote: We did the upgrade by installing the latest ProLiant Support Pack, and ran into a small issue of note while doing so. You can't upgrade the network drivers straight to v5.0.13.0, otherwise the installation will fail with an error "HP Virtual Bus Device installation requires a newer version. Version 4.6.16.0 is required". The easy way around this is to download v4.6.16.0 from HP (64-bit here, 32-bit here), and install this prior to running the PSP.)

HP Network Drivers

Within minutes of the upgrade being completed, mail and other traffic was flowing freely between both servers. A speedtest was run using iperf, which showed speeds of ~60Mb/s (previously we were seeing ~557 bytes/s), and new emails were being delivered to the server within seconds.

HP iperf Test

This was a tricky one to diagnose - but it proves how often simple things are overlooked, in search of a bigger problem!