
Added Languages to building-extensions. #487

Open: wants to merge 1 commit into main

ceford
Collaborator

@ceford ceford commented Jul 2, 2025

User description

In this section I added an index.md file as a placeholder to introduce documentation of language packs as extensions and two articles based on my own experience. They may be too personal for the Manual! I would not be offended if others considered them inappropriate for the Programmers Documentation.


PR Type

Documentation


Description

  • Added comprehensive language pack documentation section

  • Included Scottish Gaelic extension example with complete structure

  • Provided OpenAI translation automation script and methodology

  • Documented language pack creation, installation, and build processes


Changes diagram

flowchart LR
  A["Language Pack Documentation"] --> B["Extension Structure"]
  A --> C["Translation Methods"]
  B --> D["Package Files"]
  B --> E["Client Configurations"]
  C --> F["OpenAI Translation"]
  C --> G["Build Process"]

Changes walkthrough 📝

Relevant files
Documentation
index.md: Language pack documentation overview (+29/-0)
docs/building-extensions/languages/index.md

  • Created language pack documentation index page
  • Explained language pack structure and components
  • Listed different types of language documentation needs
  • Added references to existing translation resources

language-extension-example.md: Complete language extension implementation example (+423/-0)
docs/building-extensions/languages/language-extension-example.md

  • Comprehensive Scottish Gaelic language extension example
  • Detailed file structure and XML configuration examples
  • Installation scripts and metadata specifications
  • Build process using Phing and development workflow

translate-openai.md: OpenAI automated translation implementation guide (+322/-0)
docs/building-extensions/languages/translate-openai.md

  • PHP script for automated translation using OpenAI API
  • Batch processing methodology for .ini files
  • Complete implementation with error handling
  • Cost-effective translation approach documentation
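
The batch processing listed for translate-openai.md can be sketched in isolation. This is a simplified illustration, not the PR's code: `batchIniLines` and the segment array shape are hypothetical names, and unlike the PR's script (which flushes when a running per-file counter reaches a multiple of 25) this version counts lines per batch. It reuses the PR's regular expression to decide which lines carry a translatable value.

```php
<?php
// Hypothetical helper, not part of the PR: split the lines of an ini file
// into ordered segments. Lines that look like KEY="value" are grouped into
// batches of at most $size; all other lines (comments, blanks) pass through
// untouched so that the output order matches the input order.
function batchIniLines(array $lines, int $size = 25): array
{
    $pattern  = '/(.*)"(.*)"/'; // same pattern as the script in the PR
    $segments = [];
    $batch    = [];

    foreach ($lines as $line) {
        if (preg_match($pattern, $line) === 1) {
            $batch[] = $line;
            if (count($batch) === $size) {
                $segments[] = ['translate' => true, 'lines' => $batch];
                $batch = [];
            }
            continue;
        }

        // A non-translatable line: flush any pending batch first.
        if ($batch !== []) {
            $segments[] = ['translate' => true, 'lines' => $batch];
            $batch = [];
        }
        $segments[] = ['translate' => false, 'lines' => [$line]];
    }

    // Flush whatever is left at the end of the file.
    if ($batch !== []) {
        $segments[] = ['translate' => true, 'lines' => $batch];
    }

    return $segments;
}
```

Each segment with `translate` set to true would be joined with newlines and sent as one request; the others would be written to the output file unchanged.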

    Contributor

    qodo-merge-pro bot commented Jul 2, 2025

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
    🧪 No relevant tests
    🔒 Security concerns

    API key exposure:
    The translation script contains a placeholder for an OpenAI API key that could accidentally be committed with real credentials. Additionally, the script disables SSL verification (CURLOPT_SSL_VERIFYHOST and CURLOPT_SSL_VERIFYPEER set to false) which creates a security vulnerability by allowing man-in-the-middle attacks during API communications.
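
One way to address the SSL concern is to leave certificate checks enabled when building the cURL options. The following is a minimal sketch and not the PR's code: `buildCurlOptions` is a hypothetical helper, it assumes the PHP cURL extension is loaded, and it assumes the key is read from an `OPENAI_API_KEY` environment variable rather than stored in the source.

```php
<?php
// Hypothetical helper: build cURL options for a JSON POST request with
// TLS verification left on (these are also the library defaults).
function buildCurlOptions(string $url, string $apiKey, string $json): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $json,
        CURLOPT_RETURNTRANSFER => true,
        // Verify the server certificate against the CA bundle.
        CURLOPT_SSL_VERIFYPEER => true,
        // 2 = the certificate must match the requested host name.
        CURLOPT_SSL_VERIFYHOST => 2,
        CURLOPT_TIMEOUT        => 90,
        CURLOPT_HTTPHEADER     => [
            'Content-Type: application/json',
            'Authorization: Bearer ' . $apiKey,
        ],
    ];
}

// Assumed usage: the key comes from the environment, never from the file.
// $options = buildCurlOptions(
//     'https://api.openai.com/v1/chat/completions',
//     (string) getenv('OPENAI_API_KEY'),
//     json_encode($message)
// );
// $curl = curl_init();
// curl_setopt_array($curl, $options);
```

If verification fails on a development machine, the usual cause is a missing or outdated CA bundle, which is better fixed in the PHP configuration than by switching the checks off.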

    ⚡ Recommended focus areas for review

    Security Risk

    The script contains a hardcoded API key placeholder and uses cURL with SSL verification disabled, which could lead to security vulnerabilities if the script is used as-is in production environments.

    private static $open_ai_key = 'your_api_key_goes_here';
    
    /**
     * URL of current version of the API endpoint.
     */
    private static $open_ai_url = 'https://api.openai.com/v1';
    
    /**
     * Process one of the client ini folders
     *
     * @param   string  $folder The name of the folder: api, admin or site
     *
     * @return void
     */
    public function go($folder) {
    
        switch ($folder) {
            case 'api':
                $source = "api/language/en-GB/";
                $sink   = "/Users/ceford/git/cefjdemos-pkg-gd-gb/gd-GB/api_gd-GB/";
                break;
            case 'admin':
                $source = "administrator/language/en-GB/";
                $sink   = "/Users/ceford/git/cefjdemos-pkg-gd-gb/gd-GB/admin_gd-GB/";
                break;
            case 'site':
                $source  = "language/en-GB/";
                $sink   = "/Users/ceford/git/cefjdemos-pkg-gd-gb/gd-GB/site_gd-GB/";
                break;
            default:
                die("unknown folder: {$folder}\n");
        }
    
        // Read in the list of source files.
        $files = file_get_contents(__DIR__ . "/{$folder}.txt");
        $lines = explode(PHP_EOL, $files);
        $count = 0;
        $pattern = '/(.*)"(.*)"/';
        foreach ($lines as $line) {
            if (empty(trim($line))) {
                continue;
            }
    
            // If the translation has been done, skip this file.
            if (is_file($sink . $line)) {
                continue;
            }
    
            // Create an empty file.
            file_put_contents($sink . $line, "");
    
            // Read in the English ini file.
            $inifile = file_get_contents($this->base . $source . $line);
            echo "Processing {$source}{$line}\n";
            $inilines = explode(PHP_EOL, $inifile);
            $inicount = 0;
            $batch = [];
            foreach ($inilines as $iniline) {
                $test = preg_match($pattern, $iniline, $matches);
    
                if (!empty($test)) {
                    // The key is in $matches[1] and the value in $matches[2]
                    $keys[$inicount] = $matches[1];
    
                   // Add the whole line to the batch
                    $batch[] = $matches[0];
                    $inicount += 1;
                    // If the batch is a multiple of 25 send it for translation.
                    if ($inicount % 25 === 0) {
                        file_put_contents($sink . $line, $this->translateme($batch), FILE_APPEND);
                        $batch = [];
                    }
                } else {
                    // Output any pending batch translations.
                    if (!empty($batch)) {
                        file_put_contents($sink . $line, $this->translateme($batch), FILE_APPEND);
                    }
    
                   // Output the line unchanged
                    file_put_contents($sink . $line, "{$iniline}\n", FILE_APPEND);
    
                   $batch = [];
                }
            }
    
            // Translate any lines still in the batch;
            if (!empty($batch)) {
                file_put_contents($sink . $line, $this->translateme($batch), FILE_APPEND);
            }
            $count += 1;
        }
    
        echo "Total = {$count}\n\n";
    }
    
    /**
     * Prepare a batch of lines for translation
     *
     * @param array $batch The array of lines so far.
     */
    protected function translateme($batch) {
        $text = implode("\n", $batch);
    
        // submit a batch of lines to openai.com for translation.
        $translation = $this->getTranslation('Scottish Gaelic', $text);
    
        return "{$translation}\n";
    }
    
    /**
     * Compose the message to be sent to openai.com
     *
     * @param   string  $language_name      The name of the destination language in English
     * @param   string  $paragraphBuffer    The text to be translated.
     *
     * @return  string  The translated text or the original text with comments.
     */
    protected function getTranslation($language_name, $paragraphBuffer) {
        $instruction = "Please translate the following ini file text from English to {$language_name}";
        if ($language_name == 'German') {
            $instruction .= ' Please use the word Beiträge rather than Artikel. ';
        }
    
        $messages = [
            [
                "role" => "system",
                "content" => "You are a translator who translates text from English to {$language_name}. " .
                "Provide only the translated text, without any comments or explanations. " .
                "The text is in ini file format with a key followed by the value to be translated in double quotes. " .
                "The translated value must be on one line."
            ],
            [
            'role' => 'user',
            'content' => $instruction . ": \n" .
            $paragraphBuffer,
            ],
        ];
    
        $return = $this->chat($messages);
        if (empty($return['choices'])) {
            // Find out what is going on!
            //var_dump($return, $messages);
            echo "Untranslated text: " . substr($paragraphBuffer, 0, 64) . "\n";
            return "<!-- untranslated -->\n{$paragraphBuffer}\n<!-- enduntranslated -->\n";
        } else {
            return $return['choices'][0]['message']['content'];
        }
    }
    
    /**
     * Set the openai parameters and create a message: https://platform.openai.com/docs/api-reference/chat/create
     *
     * @param array $messages (each item must have "role" and "content" elements, this is the whole conversation)
     * @param int $maxTokens maximum tokens for the response in ChatGPT (1000 is the limit for gpt-3.5-turbo)
     * @param string $model valid options are "gpt-3.5-turbo", "gpt-4", and in the future probably "gpt-5"
     * @param int $responseVariants how many response to come up with (normally we just want one)
     * @param float $frequencyPenalty between -2.0 and 2.0, penalize new tokens based on their existing frequency in the answer
     * @param int $presencePenalty between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the conversation so far, increasing the AI's chances to write on new topics.
     * @param int $temperature default is 1, between 0 and 2, higher value makes the model more random in its discussion (going on tangents).
     * @param string $user if you have distinct app users, you can send a user ID here, and OpenAI will look to prevent common abuses or attacks
     */
    protected function chat(
        $messages = [],
        $maxTokens=2000,
        $model='gpt-4o',
        $responseVariants=1,
        $frequencyPenalty=0,
        $presencePenalty=0,
        $temperature=1,
        $user='') {
    
        //create message to post
        $message = new stdClass();
        $message -> messages = $messages;
        $message -> model = $model;
        $message -> n = $responseVariants;
        $message -> frequency_penalty = $frequencyPenalty;
        $message -> presence_penalty = $presencePenalty;
        $message -> temperature = $temperature;
    
        if($user) {
            $message -> user = $user;
        }
    
        $result = self::_sendMessage('/chat/completions', data: json_encode($message));
    
        return $result;
    }
    
    /**
     * Send the request message to openai.
     *
     * @param string $endpoint  Endpoint obtained from the openai url
     * @param string $data      The json encoded data to be sent.
     * @param string $method    Defaults to post.
     *
     * @return object The response to the request.
     */
    private static function _sendMessage($endpoint, $data = '', $method = 'post') {
        $apiEndpoint = self::$open_ai_url.$endpoint;
    
        $curl = curl_init();
    
        if($method == 'post') {
            $params = array(
                CURLOPT_URL => $apiEndpoint,
                CURLOPT_SSL_VERIFYHOST => false,
                CURLOPT_SSL_VERIFYPEER => false,

    Hardcoded Paths

    The build configuration and script contain hardcoded absolute file paths that are specific to the author's development environment, making the examples less portable and potentially confusing for other developers.
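
A portable alternative to the absolute paths could derive both folders from two configurable roots. This is a sketch under assumptions: `resolveFolders` is a hypothetical helper, and the two root directories are assumed to be supplied by the user (for example as command-line arguments). The relative source folders and the `gd-GB` naming are taken from the PR's script.

```php
<?php
// Hypothetical helper: resolve the English source folder and the output
// ("sink") folder for one client from configurable root directories.
function resolveFolders(
    string $client,
    string $joomlaRoot,
    string $packageRoot,
    string $tag = 'gd-GB'
): array {
    // Relative source folders as used in the PR's script.
    $sources = [
        'api'   => 'api/language/en-GB/',
        'admin' => 'administrator/language/en-GB/',
        'site'  => 'language/en-GB/',
    ];

    if (!isset($sources[$client])) {
        throw new InvalidArgumentException("unknown folder: {$client}");
    }

    // Normalise both roots to exactly one trailing slash.
    $joomlaRoot  = rtrim($joomlaRoot, '/') . '/';
    $packageRoot = rtrim($packageRoot, '/') . '/';

    return [
        'source' => $joomlaRoot . $sources[$client],
        'sink'   => "{$packageRoot}{$tag}/{$client}_{$tag}/",
    ];
}
```

Building the sink path in one place also guarantees the trailing slash that the concatenation `$sink . $line` relies on.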

    	<authorUrl>https://github.com/ceford/cefjdemos-pkg-gd-gb</authorUrl>
    	<copyright>(C) 2025 Clifford E Ford. All rights reserved.</copyright>
    	<license>GNU General Public License version 2 or later; see LICENSE.txt</license>
    	<url>https://github.com/ceford/cefjdemos-pkg-gd-gb</url>
    	<packager>Clifford E Ford</packager>
    	<packagerurl>https://github.com/ceford/cefjdemos-pkg-gd-gb</packagerurl>
    	<description><![CDATA[Scottish Gaelic translation created by openai.com]]></description>
    	<blockChildUninstall>true</blockChildUninstall>
    	<scriptfile>script.php</scriptfile>
    	<files>
    		<file type="language" client="site" id="gd-GB">site_gd-GB.zip</file>
    		<file type="language" client="administrator" id="gd-GB">admin_gd-GB.zip</file>
    		<file type="language" client="api" id="gd-GB">api_gd-GB.zip</file>
    	</files>
    	<updateservers>
    		<server type="extension" priority="2" name="Scottish Gaelic Update Site">https://github.com/ceford/cefjdemos-pkg-gd-gb/raw/main/pkg_gd-GB.zip</server>
    	</updateservers>
    </extension>

    The extension version is usually the same as the Joomla version for which it was created. An optional extra parameter may be used for updates, for example 5.3.1.1. When creating a third party extension take care not to copy any Official Joomla! elements. The JED Checker will flag some as invalid.
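
The four-part version scheme can be illustrated with PHP's `version_compare`. This is a sketch that assumes versions are ordered the way `version_compare` orders them; `matchesRelease` is a hypothetical helper, and the version numbers are only examples.

```php
<?php
// Hypothetical helper: does a language pack version belong to the given
// Joomla release? Only the first three parts are compared, so an optional
// fourth "update" part is ignored.
function matchesRelease(string $packVersion, string $joomlaVersion): bool
{
    $packParts = array_slice(explode('.', $packVersion), 0, 3);

    return implode('.', $packParts) === $joomlaVersion;
}

$joomlaVersion = '5.3.1';   // example Joomla release
$packVersion   = '5.3.1.1'; // first update of the pack for that release

// The extra part makes the pack version sort after the plain release.
$isNewer = version_compare($packVersion, $joomlaVersion, '>');
```

Here `$isNewer` is true, while `matchesRelease($packVersion, $joomlaVersion)` still reports that the pack targets release 5.3.1.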

    The script.php file

    This file is used to perform additional changes during extension install, update or uninstall. It is stored in the administrator/manifests/packages/gd-GB folder.

    <?php
    /**
     * @package    Joomla.Language

    Contributor

    qodo-merge-pro bot commented Jul 2, 2025

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    Category | Suggestion | Impact
    Security
    Use environment variables for API key

    The API key is hardcoded as a placeholder string which will cause authentication
    failures. Consider using environment variables or configuration files to
    securely store the API key instead of hardcoding it in the source code.

    docs/building-extensions/languages/translate-openai.md [66]

    -private static $open_ai_key = 'your_api_key_goes_here';
    +private static $open_ai_key;
     
    +public function __construct() {
    +    self::$open_ai_key = $_ENV['OPENAI_API_KEY'] ?? getenv('OPENAI_API_KEY');
    +    if (empty(self::$open_ai_key)) {
    +        throw new Exception('OpenAI API key not found in environment variables');
    +    }
    +}
    +
    Suggestion importance[1-10]: 7


    Why: This is a valid security best practice, but the code is an example script in documentation with a clear placeholder your_api_key_goes_here, which mitigates the immediate risk.

    Medium
    General
    Improve error handling and logging

    The error handling silently returns untranslated content without proper logging
    or retry mechanisms. This could lead to incomplete translations going unnoticed.
    Add proper error logging and consider implementing retry logic for failed API
    calls.

    docs/building-extensions/languages/translate-openai.md [205-210]

     if (empty($return['choices'])) {
    -    // Find out what is going on!
    -    //var_dump($return, $messages);
    -    echo "Untranslated text: " . substr($paragraphBuffer, 0, 64) . "\n";
    -    return "<!-- untranslated -->\n{$paragraphBuffer}\n<!-- enduntranslated -->\n";
    +    $errorMsg = "Translation failed for: " . substr($paragraphBuffer, 0, 64);
    +    error_log($errorMsg);
    +    echo $errorMsg . "\n";
    +    
    +    // Consider implementing retry logic here
    +    return "<!-- TRANSLATION_FAILED -->\n{$paragraphBuffer}\n<!-- /TRANSLATION_FAILED -->\n";
     }

    [To ensure code accuracy, apply this suggestion manually]

    Suggestion importance[1-10]: 6


    Why: The suggestion to add formal logging with error_log is a good practice, but the existing code already reports errors to the console, so the improvement to robustness is moderate for this example script.
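
The retry logic mentioned in this suggestion is not spelled out, so here is one possible shape for it. This is a minimal sketch, not code from the PR: `withRetry` is a hypothetical helper, the attempt count and delays are arbitrary defaults, and the request is passed in as a callable so the helper does not depend on how the API call is made.

```php
<?php
// Hypothetical helper: call $request until it returns a response array
// containing a non-empty 'choices' element, waiting longer between
// attempts (exponential backoff). Returns null if every attempt fails.
function withRetry(callable $request, int $maxAttempts = 3, int $baseDelayMs = 500): ?array
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $response = $request();

        if (is_array($response) && !empty($response['choices'])) {
            return $response;
        }

        error_log("Translation attempt {$attempt} of {$maxAttempts} failed");

        if ($attempt < $maxAttempts) {
            // Delay doubles each time: base, 2 x base, 4 x base, ...
            usleep($baseDelayMs * 1000 * (2 ** ($attempt - 1)));
        }
    }

    return null;
}
```

In the PR's script this could wrap the call to `chat()`, for example `withRetry(fn () => $this->chat($messages))`, with the existing "untranslated" marker written only when the result is null.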

    Low
    Increase timeout for API reliability

    The 90-second timeout may be insufficient for large translation batches,
    potentially causing incomplete translations. Consider implementing exponential
    backoff or increasing the timeout for better reliability with API rate limits.

    docs/building-extensions/languages/translate-openai.md [276]

    -CURLOPT_TIMEOUT => 90,
    +CURLOPT_TIMEOUT => 180,
    +CURLOPT_CONNECTTIMEOUT => 30,
    Suggestion importance[1-10]: 5


    Why: Increasing the timeout and adding a connection timeout can improve the script's reliability, but the current value of 90 seconds is already quite generous for the described batch sizes.

    Low

    @ceford
    Collaborator Author

    ceford commented Jul 2, 2025

    I will await any human feedback before fixing the points raised in the automated review.

    @HLeithner
    Member

    I have many concerns about this PR tbh, and need some time to give feedback.
