Developer Guidelines

Dhaval Salwala edited this page Jun 24, 2023 · 2 revisions

Adhere to appropriate coding guidelines:

https://google.github.io/styleguide/pyguide.html

Please try to use type hints

# Annotating a variable helps IDEs and static type checkers spot errors before they surface at runtime.
from pyspark.sql import DataFrame
data_frame: DataFrame = ...
# Without the annotation, these tools have a much harder time inferring the type.
data_frame = ...

More examples:

Here we fully annotate both the input parameters and the function's return type with type hints. Note that the typing module is used for non-primitive types.

from typing import Tuple

def auth_and_service_endpoints(
    coscreds: dict, location: str = "us-south", region: str = "us", public: bool = True,
) -> Tuple[str, str]:
    # [code removed for brevity]
    # Two strings are returned, so the return type is Tuple[str, str].
    return auth_endpoint, service_endpoint
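As a further illustration, here is a minimal, self-contained sketch of the same idea using container types from the typing module. The function count_words and its parameters are hypothetical and not part of this project; they only show how annotations let a static checker catch a wrong argument type before the code runs.

```python
from typing import Dict, List, Optional


def count_words(lines: List[str], ignore: Optional[List[str]] = None) -> Dict[str, int]:
    """Count word occurrences across lines, skipping any word listed in `ignore`."""
    skipped = set(ignore or [])
    counts: Dict[str, int] = {}
    for line in lines:
        for word in line.split():
            if word not in skipped:
                counts[word] = counts.get(word, 0) + 1
    return counts


result = count_words(["a b a", "b c"], ignore=["c"])

# A static checker such as mypy would flag a call like count_words("a b a"),
# because a plain str is passed where a List[str] is expected. At runtime that
# call would silently iterate over single characters instead of lines.
```

Annotating the local variable counts as well as the signature is optional, but it gives the checker a concrete type for the empty dictionary literal.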

pylint your code!

Of course, no linter is perfect (as noted in Google's style guide). However, with diligent maintenance of our .pylintrc file, we can suppress most of the annoying false positives.
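For a one-off false positive that does not justify a project-wide rule, pylint also accepts inline control comments. The sketch below is only an illustration under that assumption: make_handler and handler are hypothetical names, and unused-argument is the pylint message being silenced for a single line. Anything that recurs across the code base probably belongs in .pylintrc instead.

```python
from typing import Callable


def make_handler(prefix: str) -> Callable[[str, object], str]:
    """Build a callback whose signature is fixed by a (hypothetical) framework."""

    # The framework always passes `context`, but this handler does not need it,
    # so the warning is suppressed for this line only.
    def handler(event: str, context: object) -> str:  # pylint: disable=unused-argument
        return f"{prefix}:{event}"

    return handler


message = make_handler("job")("started", None)
```

Keeping the disable comment on the offending line, rather than at module level, limits the suppression to the one place where the warning is known to be harmless.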

If you use the Docker container and Visual Studio Code, pylint should be enabled automatically via this line. Please do not edit these project-level settings. If you really do not like something, you can override it in your $HOME/.vscode/settings.json file.

Consider using Visual Studio Code

Although the choice of a programming environment borders on a religious right, there can be some value (to the team) in following the official doctrine (this coming from a rabid Emacs user, no less). Most of the developers on the project use Visual Studio Code because it offers a number of very useful things:

  • It's completely free and cross platform (Mac, Linux, and Windows);
  • It has excellent support for just about any language out there, including Java (take note, you curmudgeonly Eclipse users), with excellent Maven integration;
  • It has a great number of very useful extensions, such as automatic pylinting (hint, hint) and ones that make working with containers running locally or remotely a breeze;
  • You can work with remote code bases via SSH in a seamless manner (e.g., edit and debug just as if everything were on your local machine);
  • It has a means to standardize certain project settings (like code formatting on save) across users.