<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
	"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>

<book id="DoingIO">
 <bookinfo>
  <title>Bus-Independent Device Accesses</title>

  <authorgroup>
   <author>
    <firstname>Matthew</firstname>
    <surname>Wilcox</surname>
    <affiliation>
     <address>
      <email>matthew@wil.cx</email>
     </address>
    </affiliation>
   </author>
  </authorgroup>

  <authorgroup>
   <author>
    <firstname>Alan</firstname>
    <surname>Cox</surname>
    <affiliation>
     <address>
      <email>alan@lxorguk.ukuu.org.uk</email>
     </address>
    </affiliation>
   </author>
  </authorgroup>

  <copyright>
   <year>2001</year>
   <holder>Matthew Wilcox</holder>
  </copyright>

  <legalnotice>
   <para>
     This documentation is free software; you can redistribute
     it and/or modify it under the terms of the GNU General Public
     License as published by the Free Software Foundation; either
     version 2 of the License, or (at your option) any later
     version.
   </para>

   <para>
     This program is distributed in the hope that it will be
     useful, but WITHOUT ANY WARRANTY; without even the implied
     warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
     See the GNU General Public License for more details.
   </para>

   <para>
     You should have received a copy of the GNU General Public
     License along with this program; if not, write to the Free
     Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
     MA 02111-1307 USA
   </para>

   <para>
     For more details see the file COPYING in the source
     distribution of Linux.
   </para>
  </legalnotice>
 </bookinfo>

<toc></toc>

  <chapter id="intro">
      <title>Introduction</title>
  <para>
	Linux provides an API which abstracts performing IO across all busses
	and devices, allowing device drivers to be written independently of
	bus type.
  </para>
  </chapter>

  <chapter id="bugs">
     <title>Known Bugs And Assumptions</title>
  <para>
	None.
  </para>
  </chapter>

  <chapter id="mmio">
    <title>Memory Mapped IO</title>
    <sect1 id="getting_access_to_the_device">
      <title>Getting Access to the Device</title>
      <para>
	The most widely supported form of IO is memory mapped IO.
	That is, a part of the CPU's address space is interpreted
	not as accesses to memory, but as accesses to a device. Some
	architectures define devices to be at a fixed address, but most
	have some method of discovering devices. The PCI bus walk is a
	good example of such a scheme. This document does not cover how
	to receive such an address, but assumes you are starting with one.
	Physical addresses are of type unsigned long.
      </para>

      <para>
	This address should not be used directly. Instead, to get an
	address suitable for passing to the accessor functions described
	below, you should call <function>ioremap</function>.
	An address suitable for accessing the device will be returned to you.
      </para>

      <para>
	After you've finished using the device (say, in your module's
	exit routine), call <function>iounmap</function> in order to return
	the address space to the kernel. Most architectures allocate new
	address space each time you call <function>ioremap</function>, and
	they can run out unless you call <function>iounmap</function>.
      </para>
    </sect1>

    <sect1 id="accessing_the_device">
      <title>Accessing the Device</title>
      <para>
	The part of the interface most used by drivers is reading and
	writing memory-mapped registers on the device. Linux provides
	interfaces to read and write 8-bit, 16-bit, 32-bit and 64-bit
	quantities. Due to a historical accident, these are named byte,
	word, long and quad accesses. Both read and write accesses are
	supported; there is no prefetch support at this time.
      </para>

      <para>
	The functions are named <function>readb</function>,
	<function>readw</function>, <function>readl</function>,
	<function>readq</function>, <function>readb_relaxed</function>,
	<function>readw_relaxed</function>, <function>readl_relaxed</function>,
	<function>readq_relaxed</function>, <function>writeb</function>,
	<function>writew</function>, <function>writel</function> and
	<function>writeq</function>.
      </para>

      <para>
	Some devices (such as framebuffers) would like to use larger
	transfers than 8 bytes at a time. For these devices, the
	<function>memcpy_toio</function>, <function>memcpy_fromio</function>
	and <function>memset_io</function> functions are provided.
	Do not use memset or memcpy on IO addresses; they
	are not guaranteed to copy data in order.
      </para>

      <para>
	The read and write functions are defined to be ordered. That is, the
	compiler is not permitted to reorder the I/O sequence. When the
	ordering can be compiler optimised, you can use
	<function>__readb</function> and friends to indicate the relaxed
	ordering. Use this with care.
      </para>

      <para>
	While the basic functions are defined to be synchronous and ordered
	with respect to each other, the busses the devices sit on may
	themselves be asynchronous. In particular, many driver authors are
	burned by the fact that PCI bus writes are posted asynchronously.
	A driver author must issue a read from the same device to ensure
	that writes have occurred in the specific cases the author cares
	about. This kind of property cannot be hidden from driver
	writers in the API. In some cases, the read used to flush the device
	may be expected to fail (if the card is resetting, for example). In
	that case, the read should be done from config space, which is
	guaranteed to soft-fail if the card doesn't respond.
      </para>

      <para>
	The following is an example of flushing a write to a device when
	the driver would like to ensure the write's effects are visible prior
	to continuing execution.
      </para>

<programlisting>
static inline void
qla1280_disable_intrs(struct scsi_qla_host *ha)
{
	struct device_reg *reg;

	reg = ha->iobase;
	/* disable risc and host interrupts */
	WRT_REG_WORD(&amp;reg->ictrl, 0);
	/*
	 * The following read will ensure that the above write
	 * has been received by the device before we return from this
	 * function.
	 */
	RD_REG_WORD(&amp;reg->ictrl);
	ha->flags.ints_enabled = 0;
}
</programlisting>

      <para>
	In addition to write posting, on some large multiprocessing systems
	(e.g. SGI Challenge, Origin and Altix machines) posted writes won't
	be strongly ordered coming from different CPUs. Thus it's important
	to properly protect parts of your driver that do memory-mapped writes
	with locks and use <function>mmiowb</function> to make sure they
	arrive in the order intended. Issuing a regular
	<function>readX</function> will also ensure write ordering, but should
	only be used when the driver has to be sure that the write has
	actually arrived at the device (not that it's simply ordered with
	respect to other writes), since a full <function>readX</function> is
	a relatively expensive operation.
      </para>

      <para>
	Generally, one should use <function>mmiowb</function> prior to
	releasing a spinlock that protects regions using
	<function>writeb</function> or similar functions that aren't
	surrounded by <function>readb</function> calls, which will ensure
	ordering and flushing. The following pseudocode illustrates what
	might occur if write ordering isn't guaranteed via
	<function>mmiowb</function> or one of the
	<function>readX</function> functions.
      </para>

<programlisting>
CPU A:  spin_lock_irqsave(&amp;dev_lock, flags)
CPU A:  ...
CPU A:  writel(newval, ring_ptr);
CPU A:  spin_unlock_irqrestore(&amp;dev_lock, flags)
        ...
CPU B:  spin_lock_irqsave(&amp;dev_lock, flags)
CPU B:  writel(newval2, ring_ptr);
CPU B:  ...
CPU B:  spin_unlock_irqrestore(&amp;dev_lock, flags)
</programlisting>

      <para>
	In the case above, newval2 could be written to ring_ptr before
	newval. Fixing it is easy though:
      </para>

<programlisting>
CPU A:  spin_lock_irqsave(&amp;dev_lock, flags)
CPU A:  ...
CPU A:  writel(newval, ring_ptr);
CPU A:  mmiowb(); /* ensure no other writes beat us to the device */
CPU A:  spin_unlock_irqrestore(&amp;dev_lock, flags)
        ...
CPU B:  spin_lock_irqsave(&amp;dev_lock, flags)
CPU B:  writel(newval2, ring_ptr);
CPU B:  ...
CPU B:  mmiowb();
CPU B:  spin_unlock_irqrestore(&amp;dev_lock, flags)
</programlisting>

      <para>
	See tg3.c for a real world example of how to use
	<function>mmiowb</function>.
      </para>

      <para>
	PCI ordering rules also guarantee that PIO read responses arrive
	after any outstanding DMA writes from that bus, since for some devices
	the result of a <function>readb</function> call may signal to the
	driver that a DMA transaction is complete. In many cases, however,
	the driver may want to indicate that the next
	<function>readb</function> call has no relation to any previous DMA
	writes performed by the device. The driver can use
	<function>readb_relaxed</function> for these cases, although only
	some platforms will honor the relaxed semantics. Using the relaxed
	read functions will provide significant performance benefits on
	platforms that support them. The qla2xxx driver provides examples
	of how to use <function>readX_relaxed</function>.
	In many cases,
	a majority of the driver's <function>readX</function> calls can
	safely be converted to <function>readX_relaxed</function> calls, since
	only a few will indicate or depend on DMA completion.
      </para>
    </sect1>

  </chapter>

  <chapter id="port_space_accesses">
    <title>Port Space Accesses</title>
    <sect1 id="port_space_explained">
      <title>Port Space Explained</title>

      <para>
	Another form of IO commonly supported is Port Space. This is a
	range of addresses separate from the normal memory address space.
	Access to these addresses is generally not as fast as accesses
	to the memory mapped addresses, and it also has a potentially
	smaller address space.
      </para>

      <para>
	Unlike memory mapped IO, no preparation is required
	to access port space.
      </para>

    </sect1>
    <sect1 id="accessing_port_space">
      <title>Accessing Port Space</title>
      <para>
	Accesses to this space are provided through a set of functions
	which allow 8-bit, 16-bit and 32-bit accesses; also
	known as byte, word and long. These functions are
	<function>inb</function>, <function>inw</function>,
	<function>inl</function>, <function>outb</function>,
	<function>outw</function> and <function>outl</function>.
      </para>

      <para>
	Some variants are provided for these functions. Some devices
	require that accesses to their ports are slowed down. This
	functionality is provided by appending a <function>_p</function>
	to the end of the function. There are also equivalents to memcpy.
	The <function>ins</function> and <function>outs</function>
	functions copy bytes, words or longs from or to the given port.
      </para>
    </sect1>

  </chapter>

  <chapter id="pubfunctions">
     <title>Public Functions Provided</title>
<!-- arch/x86/include/asm/io.h -->
<refentry id="API-virt-to-phys">
<refentryinfo>
 <title>LINUX</title>
 <productname>Kernel Hackers Manual</productname>
 <date>July 2017</date>
</refentryinfo>
<refmeta>
 <refentrytitle><phrase>virt_to_phys</phrase></refentrytitle>
 <manvolnum>9</manvolnum>
 <refmiscinfo class="version">4.1.27</refmiscinfo>
</refmeta>
<refnamediv>
 <refname>virt_to_phys</refname>
 <refpurpose>
     map virtual addresses to physical
 </refpurpose>
</refnamediv>
<refsynopsisdiv>
 <title>Synopsis</title>
  <funcsynopsis><funcprototype>
   <funcdef>phys_addr_t <function>virt_to_phys</function></funcdef>
   <paramdef>volatile void * <parameter>address</parameter></paramdef>
  </funcprototype></funcsynopsis>
</refsynopsisdiv>
<refsect1>
 <title>Arguments</title>
 <variablelist>
  <varlistentry>
   <term><parameter>address</parameter></term>
   <listitem>
    <para>
     address to remap
    </para>
   </listitem>
  </varlistentry>
 </variablelist>
</refsect1>
<refsect1>
<title>Description</title>
<para>
   The returned physical address is the physical (CPU) mapping for
   the memory address given. It is only valid to use this function on
   addresses directly mapped or allocated via kmalloc.
</para><para>

   This function does not give bus mappings for DMA transfers.
   In almost all conceivable cases a device driver should not be using
   this function.
</para>
</refsect1>
</refentry>

<refentry id="API-phys-to-virt">
<refentryinfo>
 <title>LINUX</title>
 <productname>Kernel Hackers Manual</productname>
 <date>July 2017</date>
</refentryinfo>
<refmeta>
 <refentrytitle><phrase>phys_to_virt</phrase></refentrytitle>
 <manvolnum>9</manvolnum>
 <refmiscinfo class="version">4.1.27</refmiscinfo>
</refmeta>
<refnamediv>
 <refname>phys_to_virt</refname>
 <refpurpose>
     map physical address to virtual
 </refpurpose>
</refnamediv>
<refsynopsisdiv>
 <title>Synopsis</title>
  <funcsynopsis><funcprototype>
   <funcdef>void * <function>phys_to_virt</function></funcdef>
   <paramdef>phys_addr_t <parameter>address</parameter></paramdef>
  </funcprototype></funcsynopsis>
</refsynopsisdiv>
<refsect1>
 <title>Arguments</title>
 <variablelist>
  <varlistentry>
   <term><parameter>address</parameter></term>
   <listitem>
    <para>
     address to remap
    </para>
   </listitem>
  </varlistentry>
 </variablelist>
</refsect1>
<refsect1>
<title>Description</title>
<para>
   The returned virtual address is a current CPU mapping for
   the memory address given. It is only valid to use this function on
   addresses that have a kernel mapping.
</para><para>

   This function does not handle bus mappings for DMA transfers.
   In almost all conceivable cases a device driver should not be using
   this function.
</para>
</refsect1>
</refentry>

<refentry id="API-ioremap-nocache">
<refentryinfo>
 <title>LINUX</title>
 <productname>Kernel Hackers Manual</productname>
 <date>July 2017</date>
</refentryinfo>
<refmeta>
 <refentrytitle><phrase>ioremap_nocache</phrase></refentrytitle>
 <manvolnum>9</manvolnum>
 <refmiscinfo class="version">4.1.27</refmiscinfo>
</refmeta>
<refnamediv>
 <refname>ioremap_nocache</refname>
 <refpurpose>
     map bus memory into CPU space
 </refpurpose>
</refnamediv>
<refsynopsisdiv>
 <title>Synopsis</title>
  <funcsynopsis><funcprototype>
   <funcdef>void __iomem * <function>ioremap_nocache</function></funcdef>
   <paramdef>resource_size_t <parameter>offset</parameter></paramdef>
   <paramdef>unsigned long <parameter>size</parameter></paramdef>
  </funcprototype></funcsynopsis>
</refsynopsisdiv>
<refsect1>
 <title>Arguments</title>
 <variablelist>
  <varlistentry>
   <term><parameter>offset</parameter></term>
   <listitem>
    <para>
     bus address of the memory
    </para>
   </listitem>
  </varlistentry>
  <varlistentry>
   <term><parameter>size</parameter></term>
   <listitem>
    <para>
     size of the resource to map
    </para>
   </listitem>
  </varlistentry>
 </variablelist>
</refsect1>
<refsect1>
<title>Description</title>
<para>
   ioremap performs a platform specific sequence of operations to
   make bus memory CPU accessible via the readb/readw/readl/writeb/
   writew/writel functions and the other mmio helpers. The returned
   address is not guaranteed to be usable directly as a virtual
   address.
</para><para>

   If the area you are trying to map is a PCI BAR you should have a
   look at <function>pci_iomap</function>.
</para>
</refsect1>
</refentry>

<!-- lib/pci_iomap.c -->
<refentry id="API-pci-iomap-range">
<refentryinfo>
 <title>LINUX</title>
 <productname>Kernel Hackers Manual</productname>
 <date>July 2017</date>
</refentryinfo>
<refmeta>
 <refentrytitle><phrase>pci_iomap_range</phrase></refentrytitle>
 <manvolnum>9</manvolnum>
 <refmiscinfo class="version">4.1.27</refmiscinfo>
</refmeta>
<refnamediv>
 <refname>pci_iomap_range</refname>
 <refpurpose>
     create a virtual mapping cookie for a PCI BAR
 </refpurpose>
</refnamediv>
<refsynopsisdiv>
 <title>Synopsis</title>
  <funcsynopsis><funcprototype>
   <funcdef>void __iomem * <function>pci_iomap_range</function></funcdef>
   <paramdef>struct pci_dev * <parameter>dev</parameter></paramdef>
   <paramdef>int <parameter>bar</parameter></paramdef>
   <paramdef>unsigned long <parameter>offset</parameter></paramdef>
   <paramdef>unsigned long <parameter>maxlen</parameter></paramdef>
  </funcprototype></funcsynopsis>
</refsynopsisdiv>
<refsect1>
 <title>Arguments</title>
 <variablelist>
  <varlistentry>
   <term><parameter>dev</parameter></term>
   <listitem>
    <para>
     PCI device that owns the BAR
    </para>
   </listitem>
  </varlistentry>
  <varlistentry>
   <term><parameter>bar</parameter></term>
   <listitem>
    <para>
     BAR number
    </para>
   </listitem>
  </varlistentry>
  <varlistentry>
   <term><parameter>offset</parameter></term>
   <listitem>
    <para>
     map memory at the given offset in BAR
    </para>
   </listitem>
  </varlistentry>
  <varlistentry>
   <term><parameter>maxlen</parameter></term>
   <listitem>
    <para>
     max length of the memory to map
    </para>
   </listitem>
  </varlistentry>
 </variablelist>
</refsect1>
<refsect1>
<title>Description</title>
<para>
   Using this function you will get a __iomem address to your device BAR.
   You can access it using ioread*() and iowrite*(). These functions hide
   whether this is an MMIO or PIO address space and do the right thing
   in either case.
</para><para>

   <parameter>maxlen</parameter> specifies the maximum length to map. If you want to get access to
   the complete BAR from offset to the end, pass <constant>0</constant> here.
</para>
</refsect1>
</refentry>

<refentry id="API-pci-iomap">
<refentryinfo>
 <title>LINUX</title>
 <productname>Kernel Hackers Manual</productname>
 <date>July 2017</date>
</refentryinfo>
<refmeta>
 <refentrytitle><phrase>pci_iomap</phrase></refentrytitle>
 <manvolnum>9</manvolnum>
 <refmiscinfo class="version">4.1.27</refmiscinfo>
</refmeta>
<refnamediv>
 <refname>pci_iomap</refname>
 <refpurpose>
     create a virtual mapping cookie for a PCI BAR
 </refpurpose>
</refnamediv>
<refsynopsisdiv>
 <title>Synopsis</title>
  <funcsynopsis><funcprototype>
   <funcdef>void __iomem * <function>pci_iomap</function></funcdef>
   <paramdef>struct pci_dev * <parameter>dev</parameter></paramdef>
   <paramdef>int <parameter>bar</parameter></paramdef>
   <paramdef>unsigned long <parameter>maxlen</parameter></paramdef>
  </funcprototype></funcsynopsis>
</refsynopsisdiv>
<refsect1>
 <title>Arguments</title>
 <variablelist>
  <varlistentry>
   <term><parameter>dev</parameter></term>
   <listitem>
    <para>
     PCI device that owns the BAR
    </para>
   </listitem>
  </varlistentry>
  <varlistentry>
   <term><parameter>bar</parameter></term>
   <listitem>
    <para>
     BAR number
    </para>
   </listitem>
  </varlistentry>
  <varlistentry>
   <term><parameter>maxlen</parameter></term>
   <listitem>
    <para>
     length of the memory to map
    </para>
   </listitem>
  </varlistentry>
 </variablelist>
</refsect1>
<refsect1>
<title>Description</title>
<para>
   Using
   this function you will get a __iomem address to your device BAR.
   You can access it using ioread*() and iowrite*(). These functions hide
   whether this is an MMIO or PIO address space and do the right thing
   in either case.
</para><para>

   <parameter>maxlen</parameter> specifies the maximum length to map. If you want to get access to
   the complete BAR without checking for its length first, pass <constant>0</constant> here.
</para>
</refsect1>
</refentry>

  </chapter>

</book>