How To Download A Url In Python And Save It To Json File
JSON is a common standard used by websites and APIs, and it is even natively supported by modern databases such as PostgreSQL. In this article, we'll present a tutorial on how to handle JSON data with Python. First, let's start with the definition of JSON.
Navigation:
- What is JSON?
- JSON in Python
- Converting JSON string to Python object
- Converting JSON file to Python object
- Converting Python object to JSON string
- Writing Python object to a JSON file
- Converting custom Python objects to JSON objects
- Creating Python class objects from JSON objects
- Loading vs dumping
- Conclusion
What is JSON?
JSON, or JavaScript Object Notation, is a format that uses text to store data objects. In other words, it is a data structure for representing objects as text. Even though it's derived from JavaScript, it has become a de facto standard for transferring objects.
This format is supported by most popular programming languages, including Python. Most commonly, JSON is used to transfer data objects through APIs. Here is an example of a JSON string:
    {
        "name": "United States",
        "population": 331002651,
        "capital": "Washington D.C.",
        "languages": [
            "English",
            "Spanish"
        ]
    }
In this example, the JSON data looks like a Python dictionary. Just like dictionaries, JSON contains data in key-value pairs. However, JSON data can also be a string, a number, a boolean, or a list.
Before JSON became popular, XML had been the common choice to represent data objects in a text format. Here is an example of the same data in the XML format:
    <?xml version="1.0" encoding="UTF-8"?>
    <country>
        <name>United States</name>
        <population>331002651</population>
        <capital>Washington D.C.</capital>
        <languages>
            <language>English</language>
            <language>Spanish</language>
        </languages>
    </country>
As is evident here, JSON is lightweight. This is one of the primary reasons why JSON is so popular. If you want to read more about the JSON standard, head over to the official JSON website.
JSON in Python
Python supports JSON data natively. The Python json module is part of the Standard Library. The json module can handle the conversion of JSON data from JSON format to the equivalent Python objects such as dictionary and list. The json package can also convert Python objects into the JSON format.
The json module provides the functionality to write custom encoders and decoders, and there is no separate installation needed. More details are available in the official documentation for the Python json module.
In the remainder of this tutorial, we will explore this package. We're going to convert JSON to dictionary and list and the other way round. We'll also explore how to handle custom classes.
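As a quick preview of that round trip, here is a minimal sketch that uses only the standard library. The sample dictionary and the variable names are illustrative assumptions, not part of the examples that follow.

```python
import json

# Hypothetical sample data used only for this illustration.
country = {"name": "United States", "languages": ["English", "Spanish"]}

# Python object -> JSON text
encoded = json.dumps(country)

# JSON text -> Python object
decoded = json.loads(encoded)

print(type(encoded))       # <class 'str'>
print(decoded == country)  # True
```

Both directions are covered step by step in the sections below.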
Converting JSON string to Python object
JSON data is frequently stored in strings. This is a common scenario when working with APIs. The JSON data would be stored in string variables before it can be parsed. As a result, the most common task related to JSON is to parse the JSON string into a Python dictionary. The json module can take care of this task easily.
The first step would be importing the Python json module. This module contains two important functions: loads and load.
Note that the first method looks like a plural form, but it is not. The letter 's' stands for 'string'.
The helpful method to parse JSON data from strings is loads. It is read as 'load-s'. The other method, load, is used when the data is read from a file object rather than a string. This is covered at length in a later section.
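To make the difference between the two functions concrete, the sketch below parses the same JSON text once with loads() and once with load(). It assumes io.StringIO as a stand-in for a real file, and the variable names are made up for this illustration.

```python
import io
import json

text = '{"name": "United States", "population": 331002651}'

# loads() parses JSON held in a string.
from_string = json.loads(text)

# load() parses JSON read from a file-like object.
# io.StringIO stands in for a real file here (an assumption for this sketch).
from_file = json.load(io.StringIO(text))

print(from_string == from_file)  # True
```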
Let's start with a simple example. The example JSON data is as follows:
    {
        "name": "United States",
        "population": 331002651
    }
JSON data can be stored as a JSON string before it is parsed. Even though we can use Python's triple quotes convention to store multi-line strings, we can remove the line breaks for readability.
    # JSON string
    country = '{"name": "United States", "population": 331002651}'
    print(type(country))
The output of this snippet will confirm that this is indeed a JSON string:
<class 'str'>
We can call the json.loads() method and provide this string as a parameter.
    import json

    country = '{"name": "United States", "population": 331002651}'
    country_dict = json.loads(country)

    print(type(country))
    print(type(country_dict))
The output of this snippet will confirm that the JSON data, which was a string, is now a Python dictionary.
    <class 'str'>
    <class 'dict'>
This dictionary can be accessed as usual:
    print(country_dict['name'])
    # OUTPUT: United States
It is important to note here that the json.loads() method will not always return a dictionary. The data type that is returned will depend on the input string. For example, this JSON string will return a list, not a dictionary.
    countries = '["United States", "Canada"]'
    countries_list = json.loads(countries)

    print(type(countries_list))
    # OUTPUT: <class 'list'>
Similarly, if the JSON string contains true, it will be converted to the equivalent Python boolean value, which is True.
    import json

    bool_string = 'true'
    bool_type = json.loads(bool_string)
    print(bool_type)
    # OUTPUT: True
The following table shows JSON objects and the Python data types after conversion. For more details, see the Python docs.
JSON | Python |
object | dict |
array | list |
string | str |
number (integer) | int |
number (real) | float |
true | True |
false | False |
null | None |
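The table can be checked with a short script. This is a minimal sketch: the sample document and its key names are made up for the illustration, with one value for each JSON type in the table.

```python
import json

# One JSON document containing every JSON type from the table.
# The key names are illustrative only.
sample = (
    '{"object": {}, "array": [], "string": "a", "integer": 1, '
    '"real": 1.5, "true": true, "false": false, "null": null}'
)

parsed = json.loads(sample)

# Map each key to the name of the Python type it was decoded to.
type_names = {key: type(value).__name__ for key, value in parsed.items()}

for key, name in type_names.items():
    print(f"{key:8} -> {name}")
```

Note that both true and false decode to the Python type bool, and null decodes to None, whose type is NoneType.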
Now, let's move on to the next topic: parsing a JSON file into a Python object.
Converting JSON file to Python object
Reading JSON files to parse JSON data into Python data is very similar to how we parse the JSON data stored in strings. Apart from the json module, Python's native open() function will also be required.
Instead of the loads method, which reads JSON strings, the method used to read JSON data in files is load().
The load() method takes a file object and returns the JSON data parsed into a Python object.
To get the file object from a file path, Python's open() function can be used.
Save the following JSON data as a new file and name it united_states.json:
    {
        "name": "United States",
        "population": 331002651,
        "capital": "Washington D.C.",
        "languages": [
            "English",
            "Spanish"
        ]
    }
Enter this Python script in a new file:
    import json

    # Open the file and parse its contents into a Python object.
    with open('united_states.json') as f:
        data = json.load(f)

    print(type(data))
Running this Python file prints the following:
    <class 'dict'>
In this example, the open function returns a file handle, which is supplied to the load method.
The variable data contains the JSON as a Python dictionary. This means that the dictionary keys can be checked as follows:
print(data.keys()) # OUTPUT: dict_keys(['name', 'population', 'capital', 'languages'])
Using this data, the value of name can be printed as follows:
    print(data['name'])
    # OUTPUT: United States
In the previous two sections, we examined how JSON can be converted to Python objects. Now, it's time to explore how to convert Python objects to JSON.
Converting Python object to JSON string
Converting Python objects to JSON objects is also known as serialization or JSON encoding. It can be achieved by using the function dumps(). It is read as dump-s, and the letter 's' stands for string.
Here is a simple example. Save this code in a new file as a Python script:
    import json

    languages = ["English", "French"]
    country = {
        "name": "Canada",
        "population": 37742154,
        "languages": languages,
        "president": None,
    }

    country_string = json.dumps(country)
    print(country_string)
When this file is run with Python, the following output is printed:
    {"name": "Canada", "population": 37742154, "languages": ["English", "French"], "president": null}
The Python object is now a JSON object. This simple example demonstrates how easy it is to convert a Python object to a JSON object. Note that the Python object was a dictionary. That's the reason it was converted into a JSON object type. Lists can be converted to JSON as well. Here is the Python script and its output:
    import json

    languages = ["English", "French"]
    languages_string = json.dumps(languages)
    print(languages_string)
    # OUTPUT: ["English", "French"]
It's not just limited to a dictionary and a list. A str, int, float, bool, and even a None value can be converted to JSON.
Refer to the conversion table below for details. As you can see, only the dictionary is converted to the JSON object type. For the official documentation, see the Python json module docs.
Python | JSON |
dict | object |
list, tuple | array |
str | string |
int, float, int- and float-derived Enums | number |
True | true |
False | false |
None | null |
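The encoding direction can be verified the same way. The following is a minimal sketch with made-up sample values, one for each built-in Python type in the table.

```python
import json

# Illustrative sample values; the comments show the expected JSON type.
values = [
    {"key": "value"},  # dict  -> object
    ["a", "b"],        # list  -> array
    ("a", "b"),        # tuple -> array
    "text",            # str   -> string
    42,                # int   -> number
    3.14,              # float -> number
    True,              # True  -> true
    False,             # False -> false
    None,              # None  -> null
]

encoded = [json.dumps(value) for value in values]

for value, text in zip(values, encoded):
    print(f"{value!r:20} -> {text}")
```

One consequence worth keeping in mind: a tuple is written as a JSON array, so reading it back produces a list, not a tuple.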
Writing Python object to a JSON file
The method used to write a JSON file is dump(). This method is very similar to the method dumps(). The only difference is that while dumps() returns a string, dump() writes to a file.
Here is a simple demonstration. This will open the file in writing mode and write the data in JSON format. Save this Python script in a file and run it.
    import json

    # Tuple is encoded to JSON array.
    languages = ("English", "French")
    # Dictionary is encoded to JSON object.
    country = {
        "name": "Canada",
        "population": 37742154,
        "languages": languages,
        "president": None,
    }

    with open('countries_exported.json', 'w') as f:
        json.dump(country, f)
When this code is executed using Python, the file countries_exported.json is created (or overwritten) and its contents are the JSON.
However, you will notice that the entire JSON is in one line. To make it more readable, we can pass one more parameter to the dump() function as follows:
    json.dump(country, f, indent=4)
This time when you run the code, the output will be nicely formatted with an indentation of 4 spaces:
    {
        "name": "Canada",
        "population": 37742154,
        "languages": [
            "English",
            "French"
        ],
        "president": null
    }
Note that this indent parameter is also available for the dumps() method. The only difference between the signatures of dump() and dumps() is that dump() needs a file object.
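Here is a minimal sketch of dumps() with and without the indent parameter. The sample dictionary is illustrative, and sort_keys=True is added only so that the key order in the output is predictable.

```python
import json

# Illustrative sample data.
country = {"name": "Canada", "population": 37742154, "president": None}

# Default output: everything on a single line.
compact = json.dumps(country)

# Pretty-printed output: 4-space indentation, keys sorted alphabetically.
pretty = json.dumps(country, indent=4, sort_keys=True)

print(compact)
print(pretty)
```

Both strings decode back to the same dictionary. Only the whitespace and key order differ.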
Converting custom Python objects to JSON objects
Let's examine the signature of the dump() method:
    dump(obj, fp, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)
Let's focus on the parameter cls.
If no class is supplied while calling the dump method, both the dump() and dumps() methods default to the JSONEncoder class. This class supports the standard Python types: dict, list, tuple, str, int, float, True, False, and None.
If we try to call the json.dumps() method on any other type, this method will raise a TypeError with a message: Object of type <your_type> is not JSON serializable.
Save the following code as a Python script and run it:
    import json

    class Country:
        def __init__(self, name, population, languages):
            self.name = name
            self.population = population
            self.languages = languages

    canada = Country("Canada", 37742154, ["English", "French"])
    print(json.dumps(canada))
    # OUTPUT: TypeError: Object of type Country is not JSON serializable
To convert the objects to JSON, we need to write a new class that extends JSONEncoder. In this class, the method default() should be implemented. This method will have the custom code to return the JSON.
Here is the example encoder for the Country class. This class will help convert a Python object to a JSON object:
    import json

    class CountryEncoder(json.JSONEncoder):
        def default(self, o):
            if isinstance(o, Country):
                # JSON object would be a dictionary.
                return {
                    "name": o.name,
                    "population": o.population,
                    "languages": o.languages
                }
            else:
                # Base class will raise the TypeError.
                return super().default(o)
This code simply returns a dictionary after confirming that the supplied object is an instance of the Country class, and otherwise calls the parent class to handle the rest of the cases.
This class can now be supplied to the json.dump() as well as json.dumps() methods.
    print(json.dumps(canada, cls=CountryEncoder))
    # OUTPUT: {"name": "Canada", "population": 37742154, "languages": ["English", "French"]}
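The signature of dump() shown earlier also lists a default parameter. As an alternative to subclassing JSONEncoder, a plain function can be passed there. The sketch below is one possible way to do it. The helper name encode_country is hypothetical, and the Country class is repeated only to keep the example self-contained.

```python
import json


class Country:
    def __init__(self, name, population, languages):
        self.name = name
        self.population = population
        self.languages = languages


def encode_country(o):
    """Hypothetical helper: return a JSON-serializable dict for Country objects."""
    if isinstance(o, Country):
        return {
            "name": o.name,
            "population": o.population,
            "languages": o.languages,
        }
    # Mirror the standard error for any other unsupported type.
    raise TypeError(f"Object of type {type(o).__name__} is not JSON serializable")


canada = Country("Canada", 37742154, ["English", "French"])

# The function is called only for objects the encoder cannot handle itself.
country_json = json.dumps(canada, default=encode_country)
print(country_json)
```

A function passed as default is convenient for one-off conversions, while an encoder class is easier to reuse across a larger code base.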
Creating Python class objects from JSON objects
So far, we have discussed how the json.load() and json.loads() methods can create a dictionary, list, and more. What if we want to read a JSON object and create a custom class object?
In this section, we will create a custom JSON decoder that will help us create custom objects. This custom decoder will allow us to use the json.load() and json.loads() methods, which will return a custom class object.
We will work with the same Country class that we used in the previous section. Using a custom encoder, we were able to write code like this:
    # Create an object of class Country
    canada = Country("Canada", 37742154, ["English", "French"])
    # Use json.dump() to create a JSON file in writing mode
    with open('canada.json', 'w') as f:
        json.dump(canada, f, cls=CountryEncoder)
If we try to parse this JSON file using the json.load() method, we will get a dictionary:
    with open('canada.json', 'r') as f:
        country_object = json.load(f)
    print(type(country_object))
    # OUTPUT: <class 'dict'>
To get an instance of the Country class instead of a dictionary, we need to create a custom decoder. This decoder class will extend JSONDecoder. In this class, we will write a method named object_hook. In this method, we will create the object of the Country class by reading the values from the dictionary.
Apart from writing this method, we also need to call the __init__ method of the base class and set the value of the parameter object_hook to this method. For simplicity, we can use the same name.
    import json

    class CountryDecoder(json.JSONDecoder):
        def __init__(self, object_hook=None, *args, **kwargs):
            # Always route decoded JSON objects through our own hook.
            super().__init__(object_hook=self.object_hook, *args, **kwargs)

        def object_hook(self, o):
            decoded_country = Country(
                o.get('name'),
                o.get('population'),
                o.get('languages'),
            )
            return decoded_country
Note that we are using the .get() method to read dictionary keys. This will ensure that no errors are raised if a key is missing from the dictionary; the missing value will simply be None.
Finally, we can call the json.load() method and set the cls parameter to the CountryDecoder class.
    with open('canada.json', 'r') as f:
        country_object = json.load(f, cls=CountryDecoder)
    print(type(country_object))
    # OUTPUT: <class '__main__.Country'>
That's it! We now have a custom object created directly from JSON.
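One detail to keep in mind is that an object hook is called for every JSON object in the input, including nested ones. The sketch below shows a more defensive variant under stated assumptions: a plain function is passed through the object_hook parameter of json.loads(), and it builds a Country only when the expected keys are present. The function name as_country and the key check are hypothetical, and the Country class is repeated to keep the example self-contained.

```python
import json


class Country:
    def __init__(self, name, population, languages):
        self.name = name
        self.population = population
        self.languages = languages


def as_country(o):
    """Hypothetical hook: build a Country only when the expected keys are present."""
    if o.keys() >= {"name", "population", "languages"}:
        return Country(o["name"], o["population"], o["languages"])
    # Leave any other JSON object as a plain dictionary.
    return o


text = '{"name": "Canada", "population": 37742154, "languages": ["English", "French"]}'

canada = json.loads(text, object_hook=as_country)
other = json.loads('{"unrelated": 1}', object_hook=as_country)

print(type(canada).__name__)  # Country
print(type(other).__name__)   # dict
```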
Loading vs dumping
The Python json module has four key functions: dump(), dumps(), load(), and loads(). It often becomes confusing to remember these functions. The most important thing to remember is that the letter 's' stands for string. Also, read the letter 's' separately in the functions loads() and dumps(), that is, read loads() as load-s and read dumps() as dump-s.
Here is a quick table to help you remember these functions:
Operation | File | String |
Read | load() | loads() |
Write | dump() | dumps() |
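The four functions can be seen side by side in one short script. This is a minimal sketch: the sample data and the file name country.json are assumptions, and a temporary directory is used so that no files are left behind.

```python
import json
import os
import tempfile

# Illustrative sample data.
data = {"name": "Canada", "languages": ["English", "French"]}

# String pair: dumps() writes to a string, loads() reads from one.
text = json.dumps(data)
from_text = json.loads(text)

# File pair: dump() writes to a file object, load() reads from one.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "country.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)
    with open(path, "r", encoding="utf-8") as f:
        from_file = json.load(f)

print(from_text == from_file == data)  # True
```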
Conclusion
In this tutorial, we explored reading and writing JSON data using Python. Knowing how to work with JSON data is essential, particularly when working with websites. JSON is used to transfer and store data everywhere, including APIs, web scrapers, and modern databases like PostgreSQL.
Understanding JSON is crucial if you are working on a web scraping project that involves dynamic websites. Head over to our blog post for a practical example of JSON being useful for pages with infinite scroll.
Source: https://oxylabs.io/blog/python-parse-json
Posted by: bergerontatied.blogspot.com