Polars is a Python (and Rust) library for working with tabular data, similar to Pandas, but with high performance, optimized queries, and support for larger-than-RAM datasets. It has a powerful API, supports lazy and eager execution, and leverages multi-core processors and SIMD instructions for efficient data processing.
Gigasheet is a popular no-code data transformation tool that lets you merge, explore, and transform large datasets without writing code. We show you how to easily bring data into Einblick for analysis and visualization, and how to write data to Gigasheet from Einblick.
pandasql is a Python library that allows you to use SQL syntax to work with data in the pandas DataFrame structure. It provides simple access to powerful data analysis functionality and makes it easier for users familiar with SQL syntax to utilize the power of pandas.
In this post, I’ll explore the Python Faker library, which can be used for generating fake data. I’ll cover the profile provider and how to customize it, as well as the DynamicProvider class for further customization. Read on to learn more.
BeautifulSoup is a Python package designed for parsing HTML and turning the markup into something navigable and searchable. Easy scraping can improve your life tremendously: here, I was using it to assemble a list of on-sale wines at my local wine store. We also use the Requests package to fetch the page at a given URL (taking bets on when Requests is going to be baked in).
Snowpark allows developers to use familiar languages and coding styles to run code directly on Snowflake compute.
OpenAI has released a powerful API to use with their pre-trained models. This includes generative AI solutions like text completion and natural language, without the need to train models locally or work with heavyweight machines. This canvas example is designed to show you how to get started in Python.
Getting Twitter data into your Python analysis is easy with the Tweepy library. In this Tools post, we give a crash course on how to find tweets related to a given hashtag and pull them in (and how to do a quick sentiment analysis).
Use pydeck to bring the power of Uber’s open source deck.gl to Python and create stunning map visualizations.
Use the Pushshift API and the Reddit API to create novel datasets by pulling Reddit data into Python data frames, then easily transition to NLP and ML analysis of those datasets.
Let's face it: the Snowflake web uploader is painful to use. Here's my script to take a CSV or the results of a Python notebook, and write it to your Snowflake database.
Combine the data wrangling power of the Python ecosystem and the map visualization strengths of the leaflet.js library through folium.
Define the minimum and maximum values and the dimensions of the table to generate, then create a pandas DataFrame filled with random values.
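To give a flavor of that last one, here is a minimal sketch of a random-table generator. It is not the code from the post itself: the function name `random_table`, its parameters, the `col_N` column names, and the choice of a uniform distribution are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def random_table(n_rows, n_cols, low=0.0, high=1.0, seed=None):
    """Return an n_rows x n_cols DataFrame of uniform random floats in [low, high).

    Hypothetical helper: the name, signature, and column labels are
    illustrative assumptions, not taken from the original post.
    """
    if low >= high:
        raise ValueError("low must be smaller than high")
    # A seeded generator makes the table reproducible between runs.
    rng = np.random.default_rng(seed)
    values = rng.uniform(low, high, size=(n_rows, n_cols))
    columns = [f"col_{i}" for i in range(n_cols)]
    return pd.DataFrame(values, columns=columns)


# Example: a 5 x 3 table with values between -10 and 10.
df = random_table(n_rows=5, n_cols=3, low=-10, high=10, seed=42)
print(df)
```

Passing a `seed` keeps the output stable, which is handy when the random table feeds a demo or a test. Swapping `rng.uniform` for `rng.integers` would give whole numbers instead.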