
URL. English acronym for Uniform Resource Locator. It is a string of characters that assigns a unique address to each information resource available on the Internet. There is a unique URL for every page of every document on the World Wide Web, for every Gopher item, for every USENET discussion group, and so on.
History
Uniform resource locators were a fundamental innovation in the history of the Internet. They were first used by Tim Berners-Lee in 1991 to allow document authors to establish hyperlinks on the World Wide Web. Since 1994, Internet standards have folded the concept of the URL into the more general URI (uniform resource identifier), but the term URL is still widely used.
Although no standard has ever defined it that way, many people believe that the initials URL stand for universal resource locator. This interpretation may stem from the fact that, although the U in URL has always meant “uniform”, the U in URI originally meant “universal”, prior to the publication of RFC 2396.
The URL of an information resource is its address on the Internet, which allows the browser to find and display it appropriately. Therefore, the URL combines the name of the computer that provides the information, the directory where it is located, the name of the file, and the protocol to use to retrieve the data.
Definition
The general format of a URL is:
scheme://machine/directory/file
Other data can also be added:
scheme://user:password@machine:port/directory/file
URL scheme
A URL is classified by its scheme, which generally indicates the network protocol used to retrieve the identified resource over the network. A URL begins with the name of its scheme, followed by a colon, followed by a scheme-specific part.
Some examples of URL schemes
- http – (Hypertext Transfer Protocol) the protocol used to transmit hypertext. All HTML pages on WWW servers are referenced through this scheme, which indicates a connection to a WWW server.
- https – (Hypertext Transfer Protocol Secure) the protocol for connecting to secure WWW servers. These servers, often commercial, use encryption to prevent the interception of data in transit, such as credit card numbers and personal data.
- ftp – (File Transfer Protocol) used when the information to be accessed is on an FTP server. By default the connection is anonymous; to specify a user name, use the form ftp://[email protected], after which the password is requested.
- mailto – used to send email; not all browsers support it. Only the destination email address is given: mailto:alias.mail@domain
- ldap – Lightweight Directory Access Protocol (LDAP) lookups.
- file – resources available on the local system or on a local network.
- news – access to the news service; the WWW browser must be able to present this service, and not all of them can. The news server is given, with the newsgroup to access as the path.
- gopher – the Gopher protocol (deprecated).
- telnet – remote terminal emulation for connecting to a multi-user machine, used for example to access public accounts such as a library's. The browser normally calls an external application to make the connection. The machine and the login are indicated.
- data – the scheme for embedding small pieces of content directly in documents (data: URL).
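As an illustration of the data scheme in the list above, here is a minimal Python sketch that packs a small payload into a base64-encoded data: URL and unpacks it again. The helper names make_data_url and read_data_url are hypothetical, chosen for this example, and the reader handles only the base64 form produced by the first helper.

```python
import base64


def make_data_url(payload: bytes, media_type: str = "text/plain") -> str:
    """Build a base64-encoded data: URL for a small payload (hypothetical helper)."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{media_type};base64,{encoded}"


def read_data_url(url: str) -> bytes:
    """Recover the payload from a base64 data: URL built by make_data_url."""
    # Everything before the first comma is the header, everything after is data.
    header, _, encoded = url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data: URL")
    return base64.b64decode(encoded)


example = make_data_url(b"Hello, World!")
print(example)
print(read_data_url(example))
```

The content travels inside the URL itself, so no server is contacted to retrieve it.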
Some URL schemes, such as the popular “mailto”, “http”, “ftp”, and “file”, along with the general URL syntax, were first detailed in 1994 in RFC 1630, which was superseded by the more specific RFC 1738 (1994) and RFC 1808 (1995).
Some of the schemes defined in the first RFC are still valid, while others are debated or have been refined by later standards. Meanwhile, the definition of the general URL syntax was split off into a separate line of URI specifications: RFC 2396 (1998) and RFC 2732 (1999), both now obsolete but still widely referenced in URL scheme definitions.
The current standard is STD 66 / RFC 3986 (2005).
Generic URL syntax
All URLs, regardless of scheme, must follow a general syntax. Each scheme can set its own syntax requirements for its scheme-specific part, but the full URL must still follow the general syntax.
Using a limited set of characters, compatible with the printable subset of ASCII, the generic syntax allows URLs to represent the address of a resource regardless of the original form of the address components.
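One common way to fit arbitrary text into that limited character set is percent-encoding. The sketch below uses quote and unquote from Python's standard urllib.parse module; the sample path is invented for illustration, and UTF-8 (the library default) is assumed as the underlying byte encoding.

```python
from urllib.parse import quote, unquote

# A made-up path containing a space and non-ASCII letters.
raw_path = "/docs/año nuevo/résumé.txt"

# quote() leaves "/" alone by default and replaces each unsafe byte
# of the UTF-8 encoding with "%" followed by two hexadecimal digits.
encoded_path = quote(raw_path)
print(encoded_path)

# unquote() reverses the transformation.
print(unquote(encoded_path))
```

The encoded form contains only printable ASCII characters, and decoding it gives back the original text.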
Schemes that use typical connection-based protocols share a common syntax for “generic URIs”, defined as follows:
scheme://authority/path?query#fragment
The authority usually consists of the name or IP address of a server, optionally followed by a colon (“:”) and a TCP port number. It can also include a user name and password for authenticating to the server.
The path specifies a location in a hierarchical structure, using a slash (“/”) as the delimiter between components.
The query usually carries parameters for a dynamic query to a database or a process running on the server.
The fragment identifies a portion of a resource, usually a location within a document.
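The components described above can be pulled apart with urlsplit from Python's standard urllib.parse module. This is a small sketch: the URL, including its host, credentials, and port, is invented for the example.

```python
from urllib.parse import parse_qs, urlsplit

# An invented URL that exercises every part of the generic syntax.
url = "https://user:secret@www.example.com:8443/docs/guide.html?lang=en&page=2#install"
parts = urlsplit(url)

print(parts.scheme)     # the scheme
print(parts.netloc)     # the whole authority
print(parts.username, parts.password)  # optional credentials in the authority
print(parts.hostname, parts.port)      # server name and TCP port
print(parts.path)       # hierarchical path
print(parts.query)      # raw query string
print(parts.fragment)   # fragment

# The query can be decoded further into named parameters.
params = parse_qs(parts.query)
print(params)
```

Here urlsplit only separates the string at the delimiters; it does not check that the host exists or that the path points at anything.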
Case sensitivity
According to the current standard, the scheme and host components are case-insensitive and, when normalized during processing, should be lowercase. Other components must be assumed to be case-sensitive. In practice, however, whether case matters in components other than the scheme and host depends on the web server and on the operating system of the machine that hosts it.
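A minimal sketch of that normalization in Python follows. The function name normalize_case is hypothetical; it lowercases only the scheme and the host and leaves every other component as written.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_case(url: str) -> str:
    """Lowercase the scheme and host; keep other components untouched."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # Split off any "user:password@" prefix so its case is preserved.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = f"{userinfo}{at}{hostport.lower()}"
    return urlunsplit((scheme, netloc, parts.path, parts.query, parts.fragment))


print(normalize_case("HTTP://WWW.Example.COM/Docs/Index.HTML"))
```

The path keeps its capital letters because the server may treat /Docs/Index.HTML and /docs/index.html as different resources.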
URL in daily use
An HTTP URL combines into one simple address the four basic pieces of information necessary to retrieve a resource from anywhere on the Internet:
- The protocol used to communicate,
- The host (server) you communicate with,
- The network port on the server to connect to,
- The path to the resource on the server (for example, its file name).
Many web browsers do not require the user to enter “http://” to go to a web page, since HTTP is the most common protocol used in web browsers. Likewise, since 80 is the default port for HTTP, it is usually not specified.
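The default-port rule can be sketched in Python as follows. The table DEFAULT_PORTS and the function effective_port are hypothetical names for this example; the table lists only a few well-known schemes and is not exhaustive.

```python
from typing import Optional
from urllib.parse import urlsplit

# Well-known default ports for a few schemes (illustrative, not exhaustive).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def effective_port(url: str) -> Optional[int]:
    """Return the explicit port if present, else the scheme's default."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    # None is returned for schemes missing from the table.
    return DEFAULT_PORTS.get(parts.scheme)


print(effective_port("http://www.example.com/"))
print(effective_port("http://www.example.com:8080/"))
```

Both URLs name the same host, but only the second one needs the port written out, because 8080 is not the default for HTTP.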
Since the HTTP protocol allows a server to respond to a request by redirecting the web browser to a different URL, many servers additionally allow users to omit certain parts of the URL, such as the “www.” prefix, or the trailing slash (“/”) if the resource in question is a directory. However, these omissions technically produce a different URL, so the web browser cannot make these adjustments itself and has to rely on the server to respond with a redirect. It is possible, though rare in practice, for a web server to offer two different pages for URLs that differ only by a trailing “/” character.