feat: add support for sparkmeasure #5202

Merged

@sjgllgh sjgllgh commented Nov 14, 2024

What is the purpose of the change

Add support for SparkMeasure to the Spark engine for better monitoring of Spark's performance.

Related issues/PRs

Related issues: #5200

How to use

  1. Submit the parameters through the RESTful API, for example:
    {
      "executionContent": {
        "code": "select * from test1.test1",
        "runType": "sql"
      },
      "labels": {
        "engineType": "spark-3.2.1",
        "userCreator": "zhangyuyao-IDE"
      },
      "params": {
        "configuration": {
          "runtime": {
            "linkis.sparkmeasure.aggregate.type": "stage"
          },
          "startup": {
            "linkis.sparkmeasure.flight.recorder.type": "task"
          }
        }
      }
    }
  2. It only takes effect for certain SQL statements, such as SELECT, INSERT, and CREATE TABLE AS SELECT.
  3. Metrics fall into two categories, aggregate and detailed, and each category can be collected at either the stage or the task level.
  4. For aggregate metrics (parameter: linkis.sparkmeasure.aggregate.type), each eligible SQL statement produces its own output file.
  5. For detailed metrics (parameter: linkis.sparkmeasure.flight.recorder.type), an engine writes only one file, and that file is generated when the engine shuts down. When using this feature, it is recommended to execute only one SQL statement per engine.
  6. Because Linkis reuses engines, setting linkis.sparkmeasure.flight.recorder.type does not necessarily create a new engine. If an existing engine is reused, no metrics file may be output.
  7. To make sure files are written to the correct path, set the following parameter on the Spark engine: linkis.sparkmeasure.output.prefix.
  8. Files can currently be written to local storage or HDFS.
  9. For a detailed introduction to SparkMeasure, refer to the SparkMeasure documentation.
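
As a rough illustration of the steps above, here is a minimal Python sketch that assembles the request body. It is not part of this PR. The helper name `build_submit_payload` is hypothetical, the check that only `stage` and `task` are accepted is inferred from the description above, and placing `linkis.sparkmeasure.output.prefix` under `startup` is an assumption (the description only says to add it to the Spark engine). The `/tmp/sparkmeasure` value is a made-up example path.

```python
import json

# Granularities named in the PR description; assumed to be the only valid values.
VALID_TYPES = {"stage", "task"}


def build_submit_payload(code, engine_type, user_creator,
                         aggregate_type=None, flight_recorder_type=None,
                         output_prefix=None):
    """Build a submit body with optional SparkMeasure settings (hypothetical helper)."""
    for name, value in (("aggregate_type", aggregate_type),
                        ("flight_recorder_type", flight_recorder_type)):
        if value is not None and value not in VALID_TYPES:
            raise ValueError(
                f"{name} must be one of {sorted(VALID_TYPES)}, got {value!r}")

    runtime, startup = {}, {}
    if aggregate_type is not None:
        # Passed as a runtime parameter, as in the example request above.
        runtime["linkis.sparkmeasure.aggregate.type"] = aggregate_type
    if flight_recorder_type is not None:
        # Passed as a startup parameter, as in the example request above.
        startup["linkis.sparkmeasure.flight.recorder.type"] = flight_recorder_type
    if output_prefix is not None:
        # Assumption: treated as a startup parameter; the PR does not say where it goes.
        startup["linkis.sparkmeasure.output.prefix"] = output_prefix

    return {
        "executionContent": {"code": code, "runType": "sql"},
        "labels": {"engineType": engine_type, "userCreator": user_creator},
        "params": {"configuration": {"runtime": runtime, "startup": startup}},
    }


if __name__ == "__main__":
    payload = build_submit_payload(
        "select * from test1.test1", "spark-3.2.1", "zhangyuyao-IDE",
        aggregate_type="stage", flight_recorder_type="task",
        output_prefix="/tmp/sparkmeasure",  # hypothetical path
    )
    print(json.dumps(payload, indent=2))
```

Validating the values before submission is only a convenience for catching typos early. Because of the engine reuse behavior described in item 6, a valid payload still does not guarantee that a flight recorder file is produced.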

Note

  1. The files generated by SparkMeasure Flight Recorder can be relatively large.

Checklist

  • I have read the Contributing Guidelines on pull requests.
  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change
  • I have verified that this change is backward compatible (If not, please discuss on the Linkis mailing list first)
  • If this is a code change: I have written unit tests to fully verify the new behavior.

@peacewong peacewong left a comment

LGTM.

@peacewong peacewong merged commit 9b5a34c into apache:master Nov 26, 2024
12 checks passed