Inefficient access to language texts?

avenger · November 2, 2009, 7:24am

In the OXID software, language constants are defined in an associative array (lang.php).

They are referenced in the template files like “[{ oxmultilang ident="COMPARE_PRODUCTATTRIBUTES" }]”.

As accessing an associative array is extremely time-consuming (especially if the keys are not sorted to allow a binary search algorithm), and knowing that almost any programming language can handle constants much more efficiently than literals, I was wondering whether defining the language constants as PHP constants (like xtCommerce does) would not be much more efficient.

Right now, retrieving each language constant requires calling a Smarty function plugin (“oxmultilang”) with some significant PHP code behind it, whereas with PHP constants Smarty compiles each reference to a direct (and thus much more efficient) PHP constant access.

These constants would be used in the templates like “[{$smarty.const.COMPARE_PRODUCTATTRIBUTES }]”.

Smarty will compile the access to the current language texts to

<?php echo smarty_function_oxmultilang(array('ident' => 'COMPARE_PRODUCTATTRIBUTES'), $this);?>

whereas the second approach would result in

<?php echo @COMPARE_PRODUCTATTRIBUTES;?>

In order to determine the efficiency of accessing the language texts in both ways, I did a timing study.

Besides the already defined language texts in the associative array, I additionally “defined” all language texts as PHP constants, like

define('ACCOUNT_LOGIN_LOGIN','Anmeldung');
define('ACCOUNT_LOGIN_BACKTOSHOP','Zurück zum Shop');
define('ACCOUNT_MAIN_TITLE','Mein Konto');
define('ACCOUNT_MAIN_BACKTOSHOP','Zurück zum Shop');
define('ACCOUNT_NEWSLETTER_TITLE','Newsletter');
define('ACCOUNT_NEWSLETTER_LOCATION','Mein Konto / ');
define('ACCOUNT_NEWSLETTER_SETTINGS','Newslettereinstellungen');
define('ACCOUNT_NEWSLETTER_SUBSCRIPTIONSUCCESS','Der Newsletter wurde abonniert.');
define('ACCOUNT_NEWSLETTER_SUBSCRIPTIONREJECT','Der Newsletter wurde abbestellt.');
define('ACCOUNT_NEWSLETTER_SUBSCRIPTION','Newsletter abonnieren:');
define('ACCOUNT_NEWSLETTER_YES','Ja');
define('ACCOUNT_NEWSLETTER_NO','Nein');
define('ACCOUNT_NEWSLETTER_SAVE','Speichern');
define('ACCOUNT_NEWSLETTER_BACKTOSHOP','Zurück zum Shop');
define('ACCOUNT_NOTICELIST_TITLE','Mein Merkzettel');
........
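
For illustration only: define() lines like these do not have to be written by hand, they could be generated from the existing array. A minimal sketch, assuming lang.php fills an array named $aLang (the array name and the helper define_lang_constants() are assumptions, not taken from the OXID code):

```php
<?php
// Hypothetical stand-in for the array defined in lang.php
// (the variable name $aLang is an assumption).
$aLang = array(
    'ACCOUNT_LOGIN_LOGIN'      => 'Anmeldung',
    'ACCOUNT_LOGIN_BACKTOSHOP' => 'Zurück zum Shop',
    'ACCOUNT_MAIN_TITLE'       => 'Mein Konto',
);

// Hypothetical helper: define every entry as a global PHP constant,
// skip names that already exist, and return how many were defined.
function define_lang_constants(array $texts)
{
    $count = 0;
    foreach ($texts as $name => $text) {
        if (!defined($name)) {
            define($name, $text);
            $count++;
        }
    }
    return $count;
}

$defined = define_lang_constants($aLang);
echo $defined . " constants defined, e.g. " . ACCOUNT_MAIN_TITLE . "\n";
```

Because names that are already defined are skipped, calling the helper a second time with the same array defines nothing new.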

Then all language text names were stored in another array (1390 entries).

$lang_names=array(
  'ACCOUNT_LOGIN_LOGIN',
  'ACCOUNT_LOGIN_BACKTOSHOP',
  'ACCOUNT_MAIN_TITLE',
  'ACCOUNT_MAIN_BACKTOSHOP',
  'ACCOUNT_NEWSLETTER_TITLE',
  'ACCOUNT_NEWSLETTER_LOCATION',
  'ACCOUNT_NEWSLETTER_SETTINGS',
  'ACCOUNT_NEWSLETTER_SUBSCRIPTIONSUCCESS',
  'ACCOUNT_NEWSLETTER_SUBSCRIPTIONREJECT',
  'ACCOUNT_NEWSLETTER_SUBSCRIPTION',
  'ACCOUNT_NEWSLETTER_YES',
  'ACCOUNT_NEWSLETTER_NO',
  'ACCOUNT_NEWSLETTER_SAVE',
  'ACCOUNT_NEWSLETTER_BACKTOSHOP',
.......
);

The timing-test was then executed with the following module:

<?php
$max_i=100;
$_start = microtime();
for ($i=1;$i<=$max_i;$i++)
{
  reset($lang_names);
  foreach ($lang_names as $lang_name)
  {
    $text=smarty_function_oxmultilang(array('ident' => $lang_name), $this);  
  }
}
$diff_a=diff_microtime($_start,microtime());
echo $max_i." accesses to language texts via 'oxmultilang': ".number_format($diff_a, 4)." second<br>
";

$_start = microtime();
for ($i=1;$i<=$max_i;$i++)
{
  reset($lang_names);
  foreach ($lang_names as $lang_name)
  {
    // Look up the constant by name; no reset() is needed inside the loop
    $text=constant($lang_name);
  }
}
$diff_c=diff_microtime($_start,microtime());
echo $max_i." accesses to language texts via PHP constants: ".number_format($diff_c, 4)." seconds<br><br>
";
$faktor=$diff_a/$diff_c;
echo "The access to language texts via 'oxmultilang' takes ".number_format($faktor, 0)." times longer than access to language texts via PHP-constants!";
exit();


// diff_microtime *********************************************************
// Calculate the difference between two different microtimes
// I like this better than the `getmicrotime()` option on PHP.net
function diff_microtime($mt_old,$mt_new)
{
  list($old_usec, $old_sec) = explode(' ',$mt_old);
  list($new_usec, $new_sec) = explode(' ',$mt_new);
  $old_mt = ((float)$old_usec + (float)$old_sec);
  $new_mt = ((float)$new_usec + (float)$new_sec);
  return $new_mt - $old_mt;
}
?>

The module runs 100 iterations. Each iteration assigns all 1390 language texts, by name, to a variable, once via the “oxmultilang” call Smarty generates and once via a PHP constant lookup.

The result is just stunning!

100 accesses to all language texts via ‘oxmultilang’: 27.1311 second
100 accesses to all language texts via PHP constants: 0.1714 seconds

The access to language texts via ‘oxmultilang’ takes 158 times longer than access to language texts via PHP-constants!
This means that retrieving a language text from the associative language array takes 158 times longer than treating the language texts as PHP constants.

So I believe that it would be worth considering a change here…

Every little helps (for performance)…

dainius.bigelis · November 2, 2009, 8:51am

Hi,

That’s a very good idea. We just need to try this and check whether it would cause any problems when changing the language (and how they could be solved). So we will investigate this solution in detail.
Thank you for your idea.

Best regards,
Dainius Bigelis

bitconstructor · November 3, 2009, 8:40am

How about using a better translation standard like gettext (or via the zend_translate gettext module)?

The biggest benefit of gettext is proper support for plural forms, which is not possible with simple strings. It is bundled with PHP and, in the case of zend_translate, even improves on certain disadvantages of native gettext. gettext supports domains (string groups) which can be loaded when needed; this would ease memory consumption and group strings into more logical entities.

e.g. english: 2 plural forms
(1 apple), (2 apples, 0 apples)

german: 2 plural forms
(1 Apfel), (2 Aepfel, 0 Aepfel)

Chinese: 1 plural form
1 apple, 2 apple, 0 apple (in chinese of course)

Croatian 3 (or 5) Plural forms
1 Jabuka, 2+ Jabuke, 5+ Jabuka, 0 Jabuka,…
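
To make the plural rules above concrete, here is a minimal sketch in plain PHP. It does not call the gettext extension; it only mimics how a plural form gets picked from the count. The helper names plural_index() and pluralize() are hypothetical, and the Croatian rule is the three-form rule commonly found in gettext catalogs:

```php
<?php
// Hypothetical helper: return the index of the plural form for count $n.
function plural_index($lang, $n)
{
    switch ($lang) {
        case 'zh':
            // One form only
            return 0;
        case 'hr':
            // Three forms: 1, 21, 31... / 2-4, 22-24... / everything else
            if ($n % 10 == 1 && $n % 100 != 11) {
                return 0;
            }
            if ($n % 10 >= 2 && $n % 10 <= 4 && ($n % 100 < 10 || $n % 100 >= 20)) {
                return 1;
            }
            return 2;
        default:
            // English and German: singular for exactly 1, plural otherwise
            return ($n != 1) ? 1 : 0;
    }
}

// Forms taken from the examples above ('zh' uses the English word as a placeholder).
$forms = array(
    'en' => array('apple', 'apples'),
    'de' => array('Apfel', 'Aepfel'),
    'zh' => array('apple'),
    'hr' => array('Jabuka', 'Jabuke', 'Jabuka'),
);

// Hypothetical helper: build "<count> <form>" for a language.
function pluralize(array $forms, $lang, $n)
{
    return $n . ' ' . $forms[$lang][plural_index($lang, $n)];
}

foreach (array(0, 1, 2, 5) as $n) {
    echo pluralize($forms, 'en', $n) . ' / ' . pluralize($forms, 'hr', $n) . "\n";
}
```

With a simple key => string array, each of these forms would need its own key plus selection logic in the template; gettext keeps the rule inside the catalog.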

IMHO, loading all translation strings at runtime, and I mean ALL, doesn’t make sense.
OXID is growing and there are a few hundred translation strings already. Why occupy memory with something that isn’t used?

regards
Tibor

csimon · November 4, 2009, 1:44pm

That would also cause huge problems with modules that define their own language constants and overwrite existing ones. The way it currently is is more flexible and extensible. A new way should provide the same flexibility and some other benefits.

If you use constants, avoid “define” and use the built-in PHP 5 way of declaring class constants.

avenger · November 4, 2009, 1:59pm

[QUOTE=csimon;17739]That would also cause huge problems with modules that define their own language constants and overwrite existing ones. The way it currently is is more flexible and extensible. A new way should provide the same flexibility and some other benefits.[/QUOTE]
Not really…

You just have to include “[B]lang.php[/B]” as the [B]last[/B] language definition file; then any redefinition in other “xx_lang.php” files will be the one that defines the constant…

[QUOTE=csimon;17739]If you use constants, avoid “define” and use the built-in PHP 5 way of declaring class constants.[/QUOTE]
What’s wrong with “define”???

csimon · November 4, 2009, 2:53pm

You declare constants globally for the whole program. This can cause conflicts and isn’t very “IDE friendly” either. Class constants are constants within specific boundaries.

[QUOTE]You just have to include “lang.php” as the last language definition file, then any redefinition in other “xx_lang.php”-files will be the constant…[/QUOTE]

Defining a constant twice will cause a PHP notice. Constants should have a constant value; otherwise it defeats the purpose of a constant.
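
A small sketch of both points, for illustration. The constant name BACK_TO_SHOP and the class names LangDe and LangEn are hypothetical; whether the second define() raises a notice or a warning depends on the PHP version, and @ is used here only to keep the demo output clean:

```php
<?php
// Global constant: the first definition wins.
define('BACK_TO_SHOP', 'Zurück zum Shop');

// A second define() for the same name fails and returns false;
// the constant keeps its first value.
$redefined = @define('BACK_TO_SHOP', 'Back to shop');

// Class constants are scoped to their class, so identical names
// in different classes do not collide.
class LangDe
{
    const BACK_TO_SHOP = 'Zurück zum Shop';
}

class LangEn
{
    const BACK_TO_SHOP = 'Back to shop';
}

var_dump($redefined);
echo BACK_TO_SHOP . "\n";
echo LangEn::BACK_TO_SHOP . "\n";
```

This is also why the include order matters with define(): the file that is loaded first decides the value, and every later definition of the same name is rejected.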

tassoman · January 16, 2010, 1:19pm

There are a lot of duplicates in the lang.php file; for example, “Back to shop” appears more than 5 times.
Moreover, it’s impossible to stay up to date with the SVN version, which changes with each commit.
Having gettext and using a POT file avoids duplicates, compiles strings, allows comments on strings, and can merge translations. Finally, you can use the poedit application instead of a text editor, and you can import a dictionary and get the raw work done faster.

oxal · January 18, 2010, 1:42pm

Hi Avenger,

I like your constructive checking of the Oxid coding.

[QUOTE=avenger;17456]accessing an associative array is extremely time-consuming (especially if the keys are not sorted to allow a binary search algorithm)[/QUOTE]

As a middle way, would it be possible to pre-sort the array in lang.php, and then use the binary search algorithm you are referring to in the Smarty code?

Best,
Achim
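
Roughly, a pre-sorted lookup could look like the sketch below. The data and the function lang_binary_search() are hypothetical, and the keys are assumed to be sorted byte-wise (as sort() or ksort() with string keys would do). Note that PHP's own arrays are implemented as hash tables, so whether this beats a plain lookup by key would have to be measured:

```php
<?php
// Hypothetical data: two parallel arrays, keys sorted byte-wise.
$keys  = array('ACCOUNT_LOGIN_LOGIN', 'ACCOUNT_MAIN_TITLE', 'ACCOUNT_NEWSLETTER_YES');
$texts = array('Anmeldung', 'Mein Konto', 'Ja');

// Hypothetical helper: classic binary search over the sorted keys.
// Returns the text for $ident, or null if the key is unknown.
function lang_binary_search(array $keys, array $texts, $ident)
{
    $lo = 0;
    $hi = count($keys) - 1;
    while ($lo <= $hi) {
        $mid = (int) (($lo + $hi) / 2);
        $cmp = strcmp($ident, $keys[$mid]);
        if ($cmp === 0) {
            return $texts[$mid];
        }
        if ($cmp < 0) {
            $hi = $mid - 1;   // continue in the lower half
        } else {
            $lo = $mid + 1;   // continue in the upper half
        }
    }
    return null;
}

echo lang_binary_search($keys, $texts, 'ACCOUNT_MAIN_TITLE') . "\n";
```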

bofh · March 4, 2010, 1:59am

If you want to stick with the current method and presort it, I would recommend another way:
make an admin backend where users can edit the texts easily and save them to the database;
from there you produce the whole file (after each edit)

so you can sort.

But I agree, I would like a simple define much more.

avenger · March 4, 2010, 5:57am

[QUOTE=bofh;26353]If you want to stick with the current method and presort it, I would recommend another way:
make an admin backend where users can edit the texts easily and save them to the database;
from there you produce the whole file (after each edit)

so you can sort.

But I agree, I would like a simple define much more.[/QUOTE]
Using a database for language texts will be the ultimately inefficient method…

And extremely clumsy to use…

Gambio GX is doing it that way (partially), and changing and adding text is such a nightmare (several actions in the admin interface) that one of the first things I did there was to revert to the good old language constants in text files…

Presorting would definitely help, but then you would have to look for changes every time you use it…

I still tend to let PHP do it in its optimized access to constants…

marco.steinhaeuser · March 5, 2010, 1:43pm

Hi,

[QUOTE]How about using a better translation standard like gettext[/QUOTE]

FYI: MaFi is about to play around with gettext and it already looks really nice. Hope he finds the time to put this module to projects.oxidforge.org soon.

Regards

bofh · March 6, 2010, 8:56pm

[QUOTE=avenger;26355]Using a database for language texts will be the ultimately inefficient method…

And extremely clumsy to use…
…
…
Presorting would definitely help, but then you would have to look for changes every time you use it…

I still tend to let PHP do it in its optimized access to constants…[/QUOTE]

First of all, why should it be inefficient?
I’m not talking about keeping it there.

OK, I’ll try to explain it again:
the user can edit the texts in the admin backend (and they are stored in the database automatically after each entry).

Once he is done and saves, we write a new, presorted language file
from the data in the database, so the lang file becomes a sort of cache file.
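
Something along these lines, as a rough sketch. The rows stand in for whatever the database query returns; the function names, the column names and the $aLang variable written into the file are assumptions:

```php
<?php
// Hypothetical rows as they might come from a translations table.
$rows = array(
    array('ident' => 'ACCOUNT_MAIN_TITLE',  'text' => 'Mein Konto'),
    array('ident' => 'ACCOUNT_LOGIN_LOGIN', 'text' => 'Anmeldung'),
);

// Hypothetical helper: write a presorted language file from the rows.
// Returns true if the file was written.
function write_lang_file($path, array $rows)
{
    $aLang = array();
    foreach ($rows as $row) {
        $aLang[$row['ident']] = $row['text'];
    }
    ksort($aLang);  // presort by key

    // var_export() produces valid PHP source for the array
    $code = "<?php\n\$aLang = " . var_export($aLang, true) . ";\n";
    return file_put_contents($path, $code, LOCK_EX) !== false;
}

// Hypothetical helper: load the generated file and return its array.
function load_lang_file($path)
{
    include $path;  // sets $aLang in this function's scope
    return $aLang;
}

$path = sys_get_temp_dir() . '/lang_cache_demo.php';
write_lang_file($path, $rows);
echo implode(', ', array_keys(load_lang_file($path))) . "\n";
```

The shop would then only ever read the generated file; the database is touched only when somebody edits a text in the backend.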

How easy or hard it is to change the text is a matter of the user interface, not of the database.

And I totally disagree: those standard texts are often needed by an end user who has to or wants to change some of them from time to time, for whatever reason.
But for him that usually means accessing the server by FTP, retrieving the file,

editing it (hopefully he doesn’t break the code) and re-uploading it.

Sorry, but language files like that are methods from the past. Also, mixing data stored in files with data in a database isn’t a clean way.

Of course, instead of writing a new lang file, fetching from the database might be an option too, but not necessarily.

I think:
users must be able to change any content from their backend. If they have to edit files directly, it’s just a disaster waiting to happen.
The way I suggest would solve that issue without changing anything. It’s just an add-on, and could be a module too; all you need is write access to the lang file.

BTW: it’s the same approach they use with the CSS files and the standard template look and feel.

Souleater · July 17, 2012, 9:31am

Any news about this? It seems they still use normal arrays rather than PHP constants, and map.php still contains many duplicate entries for the same constant. I’m still confused why there is a map.php when it just references the same constant in lang.php multiple times. Why not do it directly in lang.php?