The return of “deploy duplicate binary” and Publishing crashing

Scenario:
We ran into an interesting issue when we sent a large batch of components and pages for publishing in an SDL Web Cloud and DXA environment. After many of these items had been published, publishing got stuck in different phases such as “Waiting for publishing” and “Waiting for Deployment”, and after a few hours it eventually failed for all items.

Analysis:
Several teams analyzed the problem and found that a few of those transactions (about one in every few hundred) were giving the infamous “deploy duplicate binary” error, which is quite odd when DXA is in use.

After further analysis, it turned out that in some cases this single failure to publish a multimedia component eventually crashed the entire publishing ecosystem and failed every item in the publishing queue. Any further publishing would also get stuck in the queue and eventually fail.

Root Cause:
The failed multimedia items are large (~100 MB PDF files) and stay in memory until the default 10 retries are exhausted; this takes some time and keeps consuming memory. If a large number of items is waiting in the queue for publishing at the same time, memory utilization suffers further, which eventually crashes the publisher with an “Out of Memory” error.
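
As a rough, back-of-the-envelope sketch of why this adds up so quickly: if every failing binary stays in memory for its whole retry cycle, the number of such binaries that fit is simply the free heap divided by the file size. The heap size, the baseline usage, and the function name below are assumptions made up for illustration, not measured values or SDL APIs; only the ~100 MB file size comes from our case.

```python
def max_retained_binaries(heap_mb: float, baseline_mb: float, binary_mb: float) -> int:
    """Return how many failing binaries fit in the free heap while they wait out their retries."""
    if binary_mb <= 0:
        raise ValueError("binary_mb must be positive")
    # Free heap is whatever is left after the memory already used by queued items.
    free_mb = max(heap_mb - baseline_mb, 0)
    return int(free_mb // binary_mb)


# Assumed numbers: a 2048 MB heap with 1200 MB already taken by the waiting queue.
# Only the ~100 MB binary size is taken from the incident described above.
print(max_retained_binaries(heap_mb=2048, baseline_mb=1200, binary_mb=100))  # prints 8
```

With these assumed numbers, a ninth failing PDF would already push the process over its limit, which matches how few failures it took to bring the publisher down.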

Moral of the story:
The publishing queue appears to be completely transaction-based and to execute each transaction batch in isolation. However, there seem to be some unexpected factors that can affect the publishing process, so the failure of one item in a publishing queue CAN cause other batches of publishing items to fail.
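
To make the point concrete, here is a toy simulation of batches that are logically isolated but share one memory budget. It is only a sketch under assumptions: `ToyPublisher`, its 300 MB heap, and the item sizes are hypothetical and do not reflect how the real publisher is implemented.

```python
class ToyPublisher:
    """Hypothetical model: batches run separately, but all of them share one heap."""

    def __init__(self, heap_mb):
        self.heap_mb = heap_mb
        self.retained_mb = 0  # memory still held by items that are stuck retrying

    def publish_batch(self, items):
        """items is a list of (size_mb, fails) tuples; returns the batch status."""
        in_use = self.retained_mb  # every batch starts on top of what is retained
        status = "success"
        for size_mb, fails in items:
            if in_use + size_mb > self.heap_mb:
                return "out of memory"
            in_use += size_mb
            if fails:
                # A failing item keeps its payload while it retries.
                # The toy never releases it; real retries do end eventually.
                self.retained_mb += size_mb
                status = "failed"
        return status


publisher = ToyPublisher(heap_mb=300)
print(publisher.publish_batch([(100, True), (100, True)]))   # failed
print(publisher.publish_batch([(80, False), (80, False)]))   # out of memory

# The same healthy batch on a publisher with nothing retained goes through.
print(ToyPublisher(heap_mb=300).publish_batch([(80, False), (80, False)]))  # success
```

The second batch contains no failing items at all, yet it fails because of memory retained by the first one.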

 

Director at Content Bloom India having 12+ years of experience in Software Development Life Cycle using AGILE, Iterative and RUP approaches.

Experience in following:
- CMS packages: SDL Tridion, Adobe Experience Manager (AEM), Sitecore, Umbraco, Kentico, and Alfresco
- Search Engines: SOLR, AWS Cloud Search, Elastic Search
- .NET Technologies: .NET & .NET CE Framework, ASP.NET, ASP.NET MVC, WCF, WinForms
- Mobile Development: Android Native App, Windows Mobile App
- Database: MS-SQL Server, MySQL
- Program Management: JIRA, MS-Project, Trello
- Design Tools: MS-Visio, StarUML
- Infrastructure: Linux, Windows Server, AWS

Have decent knowledge about Core Java, Spring MVC. Instrumental in Application Architecture, Designing (HLD & LLD), Coding and deployment of .NET applications (Web, Desktop, Mobile).

Experience in following domain:
- Digital Media & eCommerce
- Travel & Hospitality
- Aviation Industry
- Education
- Insurance
- Automation
- Automobile
- Railways

Education: Bachelor Degree in Computer Engineering and Post Graduate Diploma in Business Administration with specialization in Marketing

Posted in SDL Tridion
4 comments on “The return of “deploy duplicate binary” and Publishing crashing”
  1. Vikas Kumar says:

    Yeah, who knows it better than us.. We had a fair amount of brainstorming on this. The best part is that we are aware of it now, and it is a good one to bring to the notice of all readers. Thanks Pankaj.

    Btw, weren’t the multimedia naming and the long file paths also part of the reason?

  2. Pankaj Gaur says:

    Hey Vikas, yes it was great teamwork, and thanks for reading 🙂
    The multimedia naming and long file paths were the cause of the deploy binary error, because in the Tridion database the Variant ID column is restricted to 64 characters. What I wanted to highlight here is that ONE failure in the publishing queue may result in the entire publishing queue failing.
    I will write another blog post about the long file name restriction, the use of the variant ID, and how this causes the deploy binary issue 🙂

  3. Vikas Kumar says:

    aah.. Great. Thanks 🙂

  4. Thank you for posting this article. We recently faced the same issue after migrating to SDL Web 8.1 and using the deployer service: once a few items failed due to a template error, the other items in the publishing queue also started failing. When we contacted the SDL support team, we were somehow unable to reproduce the error. Based on your post, we need to reproduce the error and resolve it with the SDL support team.
