How to speed up import of 1M objects?
zurgutt
Wednesday 22 December 2010 3:38:19 pm
I have to migrate a lot of content from one eZ installation to another (4.0 -> 4.3). It is not a straight upgrade: custom scripts convert the objects to new classes, etc.
The problem is that there are nearly a million objects. The export runs at a reasonable speed, but the import/publish operations are slow and by my estimate would take days to finish.
I can dedicate a server to this operation and tune it specifically. It is a reasonably fast box with a Xeon [email protected] and 12 GB of RAM.
Can you suggest any specific tuneups or tricks to temporarily speed up insert/publish operations for the duration of import?
Certified eZ developer looking for projects. zurgutt at gg.ee
Jérôme Vieilledent
Wednesday 22 December 2010 10:03:20 pm
Hi Zurgutt
SQLIImport tunes some performance settings for imports, such as:
Once the import process is over, a cleanup cronjob runs to clear the cache and trigger indexing.
If you're not using this extension, maybe you should consider it. You could do your transformation work in your import handler :)
Ivo Lukac
Thursday 23 December 2010 4:41:51 am
I second everything Jérôme wrote, with a few additional notes:
1. The most important thing is to spread the nodes over a lot of parent nodes. We had a lot of bad experience importing thousands of objects under the same node, as a single publish gets a bit slower with every new sibling. I didn't have time to investigate why that is; maybe it can be avoided somehow...
2. To reduce the time of a single publish, try temporarily hacking the "publish" operation definition in kernel/content/operation_definition.php and remove every method that is not crucial, like: post_publish, remove-temporary-drafts, create-notification, register-search-object, generate-object-view-cache, clear-object-view-cache, pre_publish. Maybe even some others. You need to know exactly what you are doing, of course. Try different hacks with a couple of thousand objects and measure the average time of a single publish...
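As a rough illustration of both notes, here is a language-neutral sketch in Python (a real import would be PHP against the eZ Publish API). It assumes each object has some stable identifier, spreads the objects over a fixed number of parent containers, and reports the average publish time per batch so different hacks can be compared. The names `bucket_for`, `timed_import` and the `publish` callback are hypothetical and not part of eZ Publish or SQLIImport.

```python
import time
import zlib


def bucket_for(remote_id: str, bucket_count: int) -> int:
    """Map an object's identifier to a stable bucket (parent container) index."""
    # crc32 is deterministic across runs, unlike Python's built-in hash() for str.
    return zlib.crc32(remote_id.encode("utf-8")) % bucket_count


def timed_import(remote_ids, bucket_count, publish, batch_size=1000):
    """Publish every object under its bucket and report the mean time per batch.

    `publish(remote_id, bucket)` is a hypothetical callback standing in for
    whatever actually creates and publishes the object.
    Returns a list of (objects_in_batch, average_seconds_per_publish).
    """
    report = []
    batch_total = 0.0
    batch_items = 0
    for remote_id in remote_ids:
        bucket = bucket_for(remote_id, bucket_count)
        start = time.perf_counter()
        publish(remote_id, bucket)
        batch_total += time.perf_counter() - start
        batch_items += 1
        if batch_items == batch_size:
            report.append((batch_items, batch_total / batch_items))
            batch_total, batch_items = 0.0, 0
    if batch_items:
        # Report the final, partially filled batch as well.
        report.append((batch_items, batch_total / batch_items))
    return report


if __name__ == "__main__":
    per_bucket = {}

    def fake_publish(remote_id, bucket):
        # Stand-in for the real publish call: just count siblings per bucket.
        per_bucket[bucket] = per_bucket.get(bucket, 0) + 1

    ids = [f"legacy-{n}" for n in range(5000)]
    for size, avg in timed_import(ids, bucket_count=50, publish=fake_publish):
        print(f"{size} objects, {avg * 1e6:.1f} microseconds per publish")
    print("largest bucket:", max(per_bucket.values()))
```

If the per-batch average keeps climbing while the buckets fill up, that would point to the sibling-count slowdown from note 1; if it drops after removing a step from the publish operation, that step was worth removing for the duration of the import.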
http://www.linkedin.com/in/ivolukac http://www.netgen.hr/eng/blog http://twitter.com/ilukac
gilles guirand
Thursday 23 December 2010 1:22:55 pm
I agree,
@Ivo: When you say "hack", you mean executing a specific static PHP method and/or unsetting some INI values before importing the data, I guess :) ?
-- Gilles Guirand eZ Community Board Member http://twitter.com/gandbox http://www.gandbox.fr
Tuesday 28 December 2010 3:18:31 am
No, by "hack" I mean going to kernel/content/operation_definition.php and commenting out some parts of the publish operation :) temporarily, just for the import.
Tuesday 28 December 2010 5:43:44 am
Additionally, it could pay off performance-wise to hack out some features (e.g. browserecent, etc.), but generally I think it should be possible to disable those through INI settings.