PlusNet DNS issues - Answers
Article posted on Monday, 09-Jan-2006 19:27
In answer to the issues raised here and first reported here, Stewart Norriss has come back to the usergroup with information on the DNS issues over the Christmas period.
What went wrong in the first place?
"This was an issue with new accounts only and we raised a P1 [top priority task]. Networks looked at this and identified that the cause was [that a component called] "componentadd" hadn't been running. They fixed the reason for that not running, and noticed that "componentadd" didn't pick up any new accounts for the affected period. So [we] found that the components needed to be refreshed, so Dave T took 2 random accounts and put the components into queued-reactivate (or similar) states, and networks re-ran "componentadd" and confirmed that (according to its log file) it picked up the account, created web space and added DNS entries into whatever database it adds into. So we concluded that we needed to do the same for all users."
Why did it take so long to fix?
"Dev called networks the next day to confirm the issue and the proposed solution - they then asked if we could run this solution on another 2 test accounts, which we did - networks then confirmed that the output from "componentadd" suggested all was well, but stated that they couldn't tell if DNS was updated, as there's a lead time. They then manually checked that mail and webspace had been created for the 2 test accounts, which it had, so once again, all looked good. At this point, networks let "componentadd" run and then picked 5 specimen accounts from the list and manually confirmed that mail and webspace had physically been created, and they had for all 5 accounts - this was correct and fine, pending DNS lead time, so networks updated the PUG [PlusNet UserGroup] on the issue stating that they believed it ought to be fixed in the next hour, pending the relevant lead time for DNS afterwards.
No further issues were raised and the problem was indeed fixed at this point just pending lead time.
It did not take long to fix really; however, there were some delays due to the Christmas period."
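The explanation above notes that networks could confirm mail and webspace by hand but could not tell whether DNS had updated until the lead time had passed. As an illustration only, a spot check of that kind could be sketched as below. This is not PlusNet's tooling: the function names and the hostnames in the usage note are hypothetical, and the sketch simply asks the local resolver whether each name resolves yet.

```python
import socket
from typing import Callable, Dict, Iterable


def system_resolver(hostname: str) -> str:
    """Resolve a hostname to an IPv4 address using the system resolver."""
    return socket.gethostbyname(hostname)


def check_dns(
    hostnames: Iterable[str],
    resolver: Callable[[str], str] = system_resolver,
) -> Dict[str, bool]:
    """Return a mapping of hostname -> True if it currently resolves.

    The resolver is passed in so the check can be exercised without
    touching the network (hypothetical design choice for this sketch).
    """
    results: Dict[str, bool] = {}
    for name in hostnames:
        try:
            resolver(name)
            results[name] = True
        except OSError:
            # socket.gaierror is a subclass of OSError; a failed lookup
            # is treated as "not resolving yet", which may just mean the
            # DNS lead time has not elapsed.
            results[name] = False
    return results
```

Calling check_dns(["user1.example.net", "user2.example.net"]) on a handful of specimen accounts, and again after the lead time, would show which names have started resolving. A False result on its own does not prove the entry is missing, only that this resolver cannot see it yet.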
Why were most CS staff not aware that there was a problem?
"The CSC were made aware but perhaps some agents need to be educated a little more on how things work and what problems mean and I will look at this process."
Why had no service status announcement been made and customers not informed?
"I think we dropped the ball there and I will talk to my agents as I was not here during this period."