Wednesday, July 1, 2009

A Day at the Office

Having trouble picturing what I do at work all day? I get e-mails like this, chock full of jargon. I'll translate below. Names have been changed to protect the innocent and avoid lawsuits filed by the guilty.



From: Mr A
Sent: Wednesday, July 01, 2009
To: ~Reroute Group
Subject: XYZ reroute

I put a skip on XYZ today as a result of troubles we are having with them. We are getting 503 messages with delay of 4 or more seconds causing PDD on the reroute. We are also getting 504 messages indicating gateway timeout after 20 seconds. The 504 messages are dead air calls if they are reported. From the hammer it looked like 10% of the calls to XYZ are having this problem. We also have a number of trouble tickets on calls trying to route to XYZ.

Mr B/NOC is working with XYZ to resolve this problem. When they fix the problem we can increase the SIM’s on trunk group XXXX.

Mr A



Context:

My company provides long distance service to other companies at a wholesale level. We do this by connecting to several other networks, known as carriers. In this e-mail, XYZ is one of our underlying carriers.

Mr A is a Translations Engineer for my employer. He decides where calls should route on our network.

The Reroute Group is a handful of people - like me - who need to know when we have network issues.

A skip is an override in our Lowest Cost Routing tables (LCR). The skip tells the network to treat XYZ as if it doesn't exist. The LCR is a database table that the network uses to determine the cheapest way to route a call.
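For the programmers in the audience, here is a rough sketch of how a skip changes the routing order. This is not our actual system. The carrier names ABC and DEF, the rates, and the function name are all made up for illustration; only XYZ and the idea of a skip come from the e-mail.

```python
# Hypothetical rate table: carrier -> cost per minute in dollars.
# The carriers ABC and DEF and all of the rates are invented for this example.
RATES = {"XYZ": 0.010, "ABC": 0.012, "DEF": 0.015}


def route_order(rates, skips=frozenset()):
    """Return carriers from cheapest to most expensive, ignoring skipped ones."""
    # A skipped carrier is treated as if it were not in the table at all.
    candidates = {carrier: rate for carrier, rate in rates.items()
                  if carrier not in skips}
    return sorted(candidates, key=candidates.get)


normal = route_order(RATES)                    # XYZ is cheapest, so it goes first
with_skip = route_order(RATES, skips={"XYZ"})  # XYZ vanishes from the list
```

The rate table itself never changes. Removing the skip later puts XYZ straight back at the front of the line.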

A 503 message is an error code that you would see if you were logged into a Translations terminal. It means calls are taking too long to connect.

PDD is Post Dial Delay. When you dial a phone number, the amount of time after you enter the 10th digit until the destination switch sends a signal that it is ringing the call is known as the post dial interval. Ideally, the post dial interval should be measured in milliseconds. If you have to refer to the post dial interval as PDD, that's bad. PDD of four or more seconds is very bad. PDD that lasts until the timeout limit of 20 seconds is monstrously bad.
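If you like your badness quantified, the scale above could be sketched like this. The 4-second and 20-second marks come from the e-mail. The 1-second cutoff for plain "bad" is my own assumption, as are the function name and the labels.

```python
TIMEOUT_SECONDS = 20.0   # timeout limit mentioned in the e-mail
VERY_BAD_SECONDS = 4.0   # "delay of 4 or more seconds" from the e-mail
BAD_SECONDS = 1.0        # assumption: anything you can count in seconds is bad


def describe_pdd(seconds):
    """Map a post dial interval, in seconds, to a rough severity label."""
    if seconds >= TIMEOUT_SECONDS:
        return "monstrously bad"
    if seconds >= VERY_BAD_SECONDS:
        return "very bad"
    if seconds >= BAD_SECONDS:
        return "bad"
    return "fine"
```

A call that starts ringing after 0.2 seconds comes back as "fine", while the calls in the e-mail land in "very bad" or worse.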

A 504 message is an error code indicating that a carrier has taken a call but has neither connected it nor sent back any indication of why the call isn't connecting. This is not a message you want to see. After 20 seconds, the network times out, or stops trying to connect the call.

The Hammer. I have no idea what that means.

10% means that XYZ is connecting most calls (90%) just fine but is trying to offload some of the calls to another carrier and not succeeding. The calls they are not connecting are probably ones that cost a lot to terminate. XYZ has their own LCR and it is busy trying to offload the calls to another carrier who doesn't know how much the calls cost to terminate, aka a sucker. In theory, XYZ should begin termination of all calls in a few milliseconds and do it on their network. In practice, they'll spend a few seconds trying to find another sucker to do it. That 504 error tells us they didn't find a sucker and they won't do it themselves.

A trouble ticket is a way to report and track troubles.

Mr B is the supervisor of the Network Operations Center. When stuff breaks on the network, the NOC fixes it.

SIM stands for simultaneous call. It's a measure of capacity - how many calls are being simultaneously fed into a trunk group.

A Trunk Group is a circuit going from point A to point B, in this case from our network to XYZ's.

The last paragraph can be summarized thusly: Mr B will see if XYZ is aware of the problem or not. If it's intentional, we stop sending them calls for a while. It may be unintentional, which means that when they fix the problem on their network, Mr A will increase the number of calls we send them (remove the skip).

This is an ordinary message from an ordinary day for me. There's plenty more jargon where this came from.
