Adding Memory Mapped IO


Hi All,

I was hoping I could get a pointer in the right direction for starting to add memory mapped IOs on the AVR32.

Essentially the plan is this: with an AVR32 (AP7000) on a custom board, the majority of the free parallel IOs are connected to an FPGA for a variety of board-related tasks. The FPGA will appear to the AVR32 as a big memory space with registers, RAM, etc. There are plenty of IOs and chip selects connected up; I'm just trying to figure out the best way to implement this within Linux. Or would it need to be done at the u-boot board initialization level?

Thanks for any and all info!


For any pin you plan to mmap, you have to make sure Linux is not already using it (no kernel driver claiming it).


Definitely. Which file lists what each pin is being used for? Is that the at32ap700x.c file in arch/avr32/mach-at32ap? I've been poking around in that, looking at the data structures and getting a bit of a feel for it.


I should mention that all these signals would be on the EBI bus.. shared with the SDRAM (taken from the ngw100 schematic)


Actually I just found this thread: https://www.avrfreaks.net/index.p...

Looks amazingly easy as long as the EBI stuff is hooked up, no special memory drivers required? Awesome!

Hopefully this is it. I'm going to be writing some code to twiddle EBI bits on the ngw100 board to test this out, since the custom boards haven't been made yet. However, if anyone has any more comments or suggestions (or if I am on the wrong track!) I welcome hearing them!


JamesLS wrote:
Hopefully this is it. I'm going to be writing some code to twiddle EBI bits on the ngw100 board to test this out, since the custom boards haven't been made yet. However, if anyone has anymore comments or suggestions (or if I am on the wrong track!) I welcome hearing it!
Yeah you can mmap /dev/mem to get access to raw data regions. This is really a dodgy way to do things, accessing raw memory from userspace is not a nice thing to be done, it's a bit of a layering violation.
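For reference, the /dev/mem route could look roughly like the sketch below. It is only an illustration: `map_phys` is a made-up helper name, and the physical address and length are whatever the chip select in question decodes (0x04000000 for CS4 in this thread). mmap() offsets have to be page aligned, so the helper maps from the enclosing page and adjusts the returned pointer.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map `len` bytes starting at physical address `phys` from `path`
 * (normally "/dev/mem"). The mapping starts at the enclosing page
 * boundary and the returned pointer is moved forward to `phys`.
 * Returns NULL on failure. The fd is closed once the mapping exists.
 * (`map_phys` is a hypothetical helper, not an existing API.)
 */
static volatile void *map_phys(const char *path, off_t phys, size_t len,
                               void **map_base, size_t *map_len)
{
    long page = sysconf(_SC_PAGESIZE);
    off_t base = phys & ~((off_t)page - 1);
    size_t skew = (size_t)(phys - base);
    int fd = open(path, O_RDWR | O_SYNC);
    void *p;

    if (fd < 0)
        return NULL;
    p = mmap(NULL, len + skew, PROT_READ | PROT_WRITE, MAP_SHARED, fd, base);
    close(fd);
    if (p == MAP_FAILED)
        return NULL;
    *map_base = p;            /* pass these two to munmap() later */
    *map_len = len + skew;
    return (volatile uint8_t *)p + skew;
}
```

Usage would be something like `regs = map_phys("/dev/mem", 0x04000000, 8192, &base, &len);` run as root, followed by `munmap(base, len)` when done.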

A much better option is to write an in-kernel driver which can then just directly access the memory region and present an appropriate interface to userspace. For example, any PIO on the fpga can be plugged in to the kernel's gpiolib.

That said, there isn't a great general userspace interface to gpio; I've got a semi-completed one here but I don't have a millisecond to finish it as another project is kind of blowing out and stealing my milliseconds :(

Or if you're going to be needing an interrupt handled (or even if you aren't, actually) you can use Userspace IO (UIO). What happens there is that your memory region is linked to a file which you can mmap (like /dev/mem but more 'formal' and 'correct').

-S.


Thanks squidgit!

The UIO stuff intrigues me, since I did get to thinking that mucking about in /dev/mem could cause some hazards like you mention.

Any good pointers to UIO info? I'll dig around on that..


Luckily UIO is one of the better documented things in the kernel. Its docco is in DocBook format, so you should make one of 'htmldocs', 'mandocs', 'psdocs', 'pdfdocs' or 'xmldocs' to generate readable docco from the template. The resulting docco will be called uio-howto.<ext>, where the extension of course depends on what kind of docs you built.

This all must be done in a kernel source tree 2.6.23 or newer.

Note that for a lot of these docco targets you will be prompted for extra tools to be installed.

-S.


Update: after getting some help from Lars@Atmel, I've got a fpga.c file based mostly off flash.c in my board setup directory. My question is do I need a .dev field in my platform_device declaration?

Also, is there a straightforward way to provide a register map in this structure?

Thanks for any info!

#include <linux/init.h>
#include <linux/platform_device.h>

#include <asm/arch/smc.h>

static struct smc_timing fpga_timing __initdata = {
   .ncs_read_setup      = 0,
   .nrd_setup           = 40,
   .ncs_write_setup     = 0,
   .nwe_setup           = 10,

   .ncs_read_pulse      = 80,
   .nrd_pulse           = 40,
   .ncs_write_pulse     = 65,
   .nwe_pulse           = 55,

   .read_cycle          = 120,
   .write_cycle         = 120,
};

static struct smc_config fpga_config __initdata = {
   .bus_width     = 3,
   .nrd_controlled      = 1,
   .nwe_controlled      = 1,
   .nwait_mode    = 0,
   .byte_write    = 0,
   .tdf_cycles    = 2,
   .tdf_mode      = 0,
};


static struct resource fpga_resource = {
   .start      = 0x04000000,    // CS4 Mem space
   .end        = 0x07ffffff,
   .flags      = IORESOURCE_MEM,
};

static struct platform_device fpga_device = {
   .name    = "physmap-fpga",
   .id      = 0,
   .resource   = &fpga_resource,
   .num_resources = 1,
   // .dev??
};

/* This needs to be called after the SMC has been initialized */
static int __init atngw100_fpga_init(void)
{
   int ret;

   smc_set_timing(&fpga_config, &fpga_timing);
   ret = smc_set_configuration(4, &fpga_config);
   if (ret < 0) {
      printk(KERN_ERR "atngw100: failed to set fpga timing\n");
      return ret;
   }

   platform_device_register(&fpga_device);

   return 0;
}
device_initcall(atngw100_fpga_init);

JamesLS wrote:
Update: after getting some help from Lars@Atmel, I've got a fpga.c file based mostly off flash.c in my board setup directory. My question is do I need a .dev field in my platform_device declaration?
You know, I'm not convinced you want to register a device here at all. You certainly want to set the SMC timing configuration but the struct platform_device and the call to platform_device_register() will make Linux think that _flash_ sits at this address and will try and access it as an MTD. When it doesn't respond to CFI-style commands it will get very angry ;).

You can create a platform driver with the name "physmap-fpga" and from there provide the userspace interface (eg UIO setup code) if you want. In fact that's probably not a bad idea...

In fact no, the struct platform_device as it stands will try and match the device to a driver called "physmap-fpga" ("physmap-flash" is actually the name of the MTD driver the flash wants to be bound to). As such the device will never get bound to a driver and it'll actually probably work, albeit with a bunch of extra clock cycles spent trying to find your device a home.

If you /were/ to make this look like an MTD then you would need to copy flash.c a bit more closely and create a struct physmap_flash_data which then gets stuck into struct platform_device.dev.platform_data.

JamesLS wrote:
Also, is there a straightforward way to provide a register map in this structure?
No, though I can't think how you might use it even if there were. How did you plan on accessing any register map from userspace? I mean, if you're writing an in-kernel driver then you tend to just have a big header with the register map and a couple of accessor macros (drivers/spi/atmel-spi.h is the first one which comes to mind). If userspace is to be entrusted with driving the device it's kind of assumed it already knows _how_ to drive it. If it had to query the kernel for that info then you'd usually just put the _whole_ driver in the kernel and be done with it :)
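A userspace version of that "big header with the register map and a couple of accessor macros" might look like the sketch below. The register names, offsets and the bit definition are invented for illustration; the real ones come from the FPGA design. Offsets are in bytes from the start of the mapped region.

```c
#include <stdint.h>

/* Hypothetical register offsets, in bytes from the start of the mapping. */
#define FPGA_REG_ID      0x00
#define FPGA_REG_CTRL    0x02
#define FPGA_REG_STATUS  0x04

/* Hypothetical bit in the CTRL register. */
#define FPGA_CTRL_ENABLE (1u << 0)

/*
 * 16-bit accessors. `base` is the pointer returned by mmap(), `off` is
 * one of the byte offsets above. The volatile cast forces a real bus
 * access on every use instead of letting the compiler cache the value.
 */
#define fpga_readw(base, off) \
    (*(volatile uint16_t *)((volatile uint8_t *)(base) + (off)))
#define fpga_writew(base, off, val) \
    (*(volatile uint16_t *)((volatile uint8_t *)(base) + (off)) = (uint16_t)(val))
```

With this in place, enabling the (hypothetical) block is `fpga_writew(regs, FPGA_REG_CTRL, FPGA_CTRL_ENABLE);` and the register layout lives in one header shared by all userspace code.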

-S.


Okay, I think I get it a bit more now.. I was thinking the platform_device stuff was required for any static memory type accesses, but I guess I can prune down that file even further, removing the platform_device calls and just having the smc_set_timing and smc_set_configuration.

Following that I *should* be able to mmap to the FPGA CS location (0x04000000 onwards) in userspace, is that correct? I'm not sure why I thought it would be better to declare the register space in the kernel; you're right that it doesn't make much sense that way.

BTW - Thanks for the patience with my noob-ness, I'm in no way a linux hacker (can't you tell??).. Sorta been thrown in the deep end here without having even done much in the way of C programming since leaving school many many moons ago! Definitely learning a lot here, though..


JamesLS wrote:
Following that I *should* be able to mmap to the FPGA CS location (0x04000000 onwards) in userspace, is that correct?
Indeed you should be able to map the appropriate range of /dev/mem and run with it.

I will just take this opportunity to once again poke you in the direction of UIO; there have been patches to the kernel I've seen fly by lately which aim to limit the address space that /dev/mem can actually map. UIO will be much more future-proof and the entire driver will be well under 100 lines without an interrupt. That said, I appreciate that as noobie you're probably keen to get out of the kernel as quick as possible and retreat to the safety of userland :D

JamesLS wrote:
BTW - Thanks for the patience with my noob-ness, I'm in no way a linux hacker (can't you tell??).. Sorta been thrown in the deep end here without having even done much in the way of C programming since leaving school many many moons ago! Definitely learning a lot here, though..
No prob :). Welcome back to that most wonderful mystical world that is C! (K&R C is 30 this year, I think I'll throw it a party. Any excuse to get computer nerds to the uni pub ;))

-S.


squidgit wrote:
JamesLS wrote:
Following that I *should* be able to mmap to the FPGA CS location (0x04000000 onwards) in userspace, is that correct?
Indeed you should be able to map the appropriate range of /dev/mem and run with it.

Excellent! Flee! fleeeee back to userland! ;)

Quote:

I will just take this opportunity to once again poke you in the direction of UIO; there have been patches to the kernel I've seen fly by lately which aim to limit the address space that /dev/mem can actually map. UIO will be much more future-proof and the entire driver will be well under 100 lines without an interrupt. That said, I appreciate that as noobie you're probably keen to get out of the kernel as quick as possible and retreat to the safety of userland :D

The UIO stuff definitely does have me intrigued and I will follow it up after I get this first cut going.. Basically our boards are going to be back soon-ish so I want a good first cut to get up and going. Baby steps! ;)


Update!

The UIO stuff was actually a lot easier to get implemented than I had feared, and I have it loading happily in the kernel boot sequence. Performing the open/mmap and reading/writing to the addresses seems quite straightforward, but I did have one question..

If I want to write to a single bit in a register (memory location from the given offset in this case) would my best method be to perform a read of that location, mask out the bit and write back to the memory location?

Thanks!


Think I found the answer to my question there, in this document: http://www.atmel.com/dyn/resourc...

if the memory section is mapped to fd as the pointer, would

*(fd+address) |= 0x0040; //sets bit 6
*(fd+address) &= 0x0040; //clears bit 6

be correct?

Thanks!


JamesLS wrote:
Update!

The UIO stuff was actually a lot easier to get implemented than I had feared, and I have it loading happily in the kernel boot sequence.

Fantastic! Yeah it is pretty easy huh! I just got a UIO driver for our board in linux mainline this morning ( http://www.gossamer-threads.com/... ). Given it's kernel code it kinda has to be GPL'd and given that, it's totally worth submitting it upstream so you get free reviews and don't have to maintain it yourself during kernel upgrades! If you want to attach it here/pm/email it to me I can give it an eye-over. May as well while it's all fresh in my mind. Does yours need to handle IRQs at all? I'm thinking of just doing a simple UIO stub driver for people who don't need irqs but I kinda need a valid use-case.
JamesLS wrote:
If I want to write to a single bit in a register (memory location from the given offset in this case) would my best method be to perform a read of that location, mask out the bit and write back to the memory location?
Yup, read-modify-write is the usual case. If you want to get your ASM on you might have more options. What you've got is _almost_ right, should be:

*(fd+address) |= 0x0040; //sets bit 6
*(fd+address) &= ~0x0040; //clears bit 6, note the tilde (~) 

Of course the 'bit 6' assumes the least significant bit is the 0th bit. To put it another way, using 0x40 sets or clears the second most significant bit of an 8-bit word.
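The same read-modify-write pattern can be wrapped in small helpers so the mask arithmetic is written once. This is just a sketch; the helper names are made up, and it assumes 16-bit registers with bit 0 as the least significant bit. Note that a read-modify-write is not atomic, so if the FPGA or another process can change the register between the read and the write, an update may be lost.

```c
#include <stdint.h>

/* Set one bit of a 16-bit register; `bit` counts from 0 at the LSB. */
static inline void reg_set_bit(volatile uint16_t *reg, unsigned bit)
{
    *reg |= (uint16_t)(1u << bit);
}

/* Clear one bit: AND with the inverted mask leaves all other bits alone. */
static inline void reg_clear_bit(volatile uint16_t *reg, unsigned bit)
{
    *reg &= (uint16_t)~(1u << bit);
}

/* Return 1 if the bit is set, 0 otherwise. */
static inline int reg_test_bit(const volatile uint16_t *reg, unsigned bit)
{
    return (*reg >> bit) & 1u;
}
```

For example, `reg_set_bit(mem + address, 6)` has the same effect as `*(mem + address) |= 0x0040`.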

After all these years I still have a pad of paper next to my desk covered with binary numbers and boolean algebra as I try and get this kind of stuff right!

-S.


Hey Squidgit, thanks!

I had a bit of an epiphany last night, I'm dealing with all the register values etc within the OS userspace in this case. I'm kind of more used to dealing with them directly within registers, so things like actively masking out things while I'm doing a write is what I'm used to. When I'm running a userspace program fetching the data and writing it back I can pretty much do whatever with it, operations wise. This makes it a lot more flexible!

As for the uio driver code, it's pretty short so I can just clip and paste it here:

/*
 * driver/uio/uio_fpga.c
 *
 * Copyright(C) 2008, Teradici Corporation 
 * 
 * Based on
 * Copyright(C) 2005, Benedikt Spranger 
 * Copyright(C) 2005, Thomas Gleixner 
 * Copyright(C) 2006, Hans J. Koch 
 *
 * Userspace IO fpga driver
 *
 * This driver is to interface to a CPLD device via the SMC bus
 * on the Atmel AVR32 MCU. 
 *
 * Registers and functions to be defined in user space. 
 *
 * Licensed under the GPLv2 only.
 */


#include <linux/module.h>
#include <linux/platform_device.h>
#include <linux/uio_driver.h>

#define UIO_FPGA_MEMSIZE 8192

#define FPGA_MEM_ADDR 0x04000000 // CS4 


static struct uio_info uio_fpga_info = {
	.name = "uio_fpga",
	.version = "0.0.0",
	.irq = UIO_IRQ_CUSTOM,
};


static int uio_fpga_probe(struct device *dev)
{
	int ret;
	uio_fpga_info.mem[0].addr = FPGA_MEM_ADDR;
	printk("uio_fpga_probe( %p )\n", dev );
	if (!uio_fpga_info.mem[0].addr)
		return -ENOMEM;

	uio_fpga_info.mem[0].memtype = UIO_MEM_PHYS;
	uio_fpga_info.mem[0].size = UIO_FPGA_MEMSIZE;

	printk("uio_register_device...\n");
	if (uio_register_device(dev, &uio_fpga_info)) {
		printk("uio_register_device failed\n");
		kfree((void *)uio_fpga_info.mem[0].addr);
		return -ENODEV;
	}

	return 0;

}

static int uio_fpga_remove(struct device *dev)
{
	uio_unregister_device(&uio_fpga_info);
	kfree((void *)uio_fpga_info.mem[0].addr);
	uio_fpga_info.mem[0].addr = 0;
	uio_fpga_info.mem[0].size = 0;
	return 0;
}

static void uio_fpga_shutdown(struct device *dev)
{

}

static struct platform_device *uio_fpga_device;

static struct device_driver uio_fpga_driver = {
	.name		= "uio_fpga",
	.bus		= &platform_bus_type,
	.probe		= uio_fpga_probe,
	.remove		= uio_fpga_remove,
	.shutdown	= uio_fpga_shutdown,
};

/*
 * Main initialization/remove routines
 */
static int __init uio_fpga_init(void)
{
	printk("uio_fpga_init( )\n" );
	uio_fpga_device = platform_device_register_simple("uio_fpga", -1,
							   NULL, 0);
	if (IS_ERR(uio_fpga_device))
		return PTR_ERR(uio_fpga_device);

	return driver_register(&uio_fpga_driver);
}

static void __exit uio_fpga_exit(void)
{
	platform_device_unregister(uio_fpga_device);
	driver_unregister(&uio_fpga_driver);
}

module_init(uio_fpga_init);
module_exit(uio_fpga_exit);

MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("UIO fpga driver");

Most of it is cribbed directly from the uio_cif or uio_dummy init routines. I'd appreciate a critical glance-over for sure!

Thanks


Righteo, no worries.

Technically it looks just fine, though the device registration is typically done remotely from the driver registration (usually in the board code), as is the memory region setup.

I've attached something I just hammered out to give the idea; pretty much in the board code you'd do something like:

static struct resource uio_resource[] = {
   {
		.start	= 0x04000000,
		.end	= 0x07ffffff,
   },
};

static struct platform_device uio_device = {
	.name		= "uio-noirq",
	.id		= -1,
	.resource	= uio_resource,
	.num_resources	= ARRAY_SIZE(uio_resource),
};

...

platform_device_register(&uio_device);

In this setup there's no way to pass version and name strings across, but I didn't say it was perfect :-). It just means that the board-specific stuff (i.e. the fact there's a UIO-driven device on CS4) is in the board code and the driver itself is left nice and generic.

With what you're using:

    uio_fpga_info.mem[0].addr = FPGA_MEM_ADDR;
   printk("uio_fpga_probe( %p )\n", dev );
   if (!uio_fpga_info.mem[0].addr)
      return -ENOMEM; 

All you're really testing here is whether you've set the #defines up alright so you might either want a BUILD_BUG_ON(!FPGA_MEM_ADDR) to catch it at build time, change the error to -EINVAL for invalid argument or just drop the check completely.

   if (uio_register_device(dev, &uio_fpga_info)) {
      printk("uio_register_device failed\n");
      kfree((void *)uio_fpga_info.mem[0].addr);
      return -ENODEV;
   } 

You never actually kmalloc'd that address range, so kfree()ing it will cause an oops.

static int uio_fpga_remove(struct device *dev)
{
   uio_unregister_device(&uio_fpga_info);
   kfree((void *)uio_fpga_info.mem[0].addr);

Same here

   uio_fpga_info.mem[0].addr = 0;
   uio_fpga_info.mem[0].size = 0;
   return 0;
} 

Redundant assignments, the uio_info is unregistered anyway, nothing cares.

static void uio_fpga_shutdown(struct device *dev)
{

}

No need for an empty function, remove this and the .shutdown assignment below.

The rest is just what I mentioned before regarding the separation of device and driver, no worries :-)

-S.


Excellent, thanks very much for the critique! I agree that it is much nicer to keep it generic and just have it called in my setup.c in the boards dir, so I'll change it to implement that way.

I was thinking the next step would be to add the IRQ in there, it might be useful (although not 100% required - polling will work fine)

Thanks again!


Hey Squidgit,

Hit a snag when trying out your uio_noirq routine: I get "uio-noirq uio-noirq: No memory resources specified" in the dmesg log. I put the structs in my setup.c and platform_device_register in the init routine. I'm probably missing something obvious here. Should the .id field be unique, or is -1 a self-assigning type of thing?


-1 should be self-assign but try some other number. The thing I posted was more where-I'm-going rather than an end result. Should have something, well, tested, soon.
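One other thing that may be worth checking, offered as a guess since the attached driver isn't shown here: if the driver looks up its region with platform_get_resource(pdev, IORESOURCE_MEM, 0) or otherwise filters on the resource type, a resource with no .flags set will not be found, which would fit a "No memory resources specified" message. Under that assumption, the board code would need the flag set, as in this kernel fragment (board code, not a standalone program):

```c
/* Board code fragment (kernel, not standalone). Assumes the driver finds
 * its region by asking for resources of type IORESOURCE_MEM. */
static struct resource uio_resource[] = {
	{
		.start	= 0x04000000,
		.end	= 0x07ffffff,
		.flags	= IORESOURCE_MEM,	/* marks the range as memory */
	},
};
```

This matches the earlier fpga_resource declaration in the thread, which did set .flags = IORESOURCE_MEM.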

-S.


Ah gotcha, I'll monkey around with it too and see where I can get with it.. Doing an IRQ version too? I'm working on interfacing into a CPLD based spi controller (yes, they want the spi controller on the cpld instead of passing through from the atmel - I tried!) and it would be useful to know when the controller is ready for the next page of data etc..


JamesLS wrote:
Ah gotcha, I'll monkey around with it too and see where I can get with it.. Doing an IRQ version too?
IRQ version is much harder to get generic as the interrupt-silencing procedure is completely device-dependent. As I mentioned before, I've just got a small UIO interface in to the mainstream kernel, you can probably cut and paste that and change the ISR to suit your app.

You can find the posting of the final version here: http://lkml.org/lkml/2008/3/13/127

-S.


Good stuff with the IRQs, something to chew on there!

I'm wondering now if there is a way I can fake out the driver I'm developing, so I can continue development + debug before the boards with the FPGAs come back. Is there any way to create a simple /dev/uio substitute on the board I can pre-load with registers and have the routines just run on that? Just as a way to prove out my read/write routines before there is real fpga code to try it out on..

UPDATE: I think I'll just create a char driver (good page for newbies like me: http://linuxgazette.net/125/mishra.html) and get to testing on a scratch-pad like memory device..
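An alternative that needs no kernel code at all, sketched here as a suggestion rather than what was actually done in this thread: back the "registers" with an ordinary file and mmap that. The userspace read/write routines see the same kind of pointer they would get from /dev/uio0, and the file can be pre-loaded with register values. `scratch_open` is a made-up name.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create (or reuse) a regular file of `size` bytes and map it shared,
 * so register read/write routines can be exercised without hardware.
 * Contents persist in the file between runs. Returns NULL on failure.
 * (`scratch_open` is a hypothetical helper, not an existing API.)
 */
static uint16_t *scratch_open(const char *path, size_t size)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    void *p;

    if (fd < 0)
        return NULL;
    if (ftruncate(fd, (off_t)size) != 0) {   /* make the file big enough */
        close(fd);
        return NULL;
    }
    p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                               /* mapping stays valid */
    return p == MAP_FAILED ? NULL : (uint16_t *)p;
}
```

The obvious limitation is that a file has no side effects: status bits never change on their own and there are no interrupts, so this only proves out the address arithmetic and bit twiddling.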


Excellent, I got the "scull" driver going from the "Linux Device Drivers" book to act as my virtual mem space for the userspace driver tests. Lookin' good!


Another question on the mmaping external memory.. if I have a device mmap'd, can I do the following type of thing:

for (i=0; i<16; i++) {
        mem[start_addr+i] = write_val_int;
    }

ie, direct memory writes? Or do I need to use read/write type functions?

Also, should I be able to see the /dev/uio0 device in /proc/iomem if it is set up properly? I get the following:

# cat /proc/iomem
00000000-007fffff : physmap-flash.0
  00000000-007fffff : physmap-flash.0
10000000-11ffffff : System RAM
  10013000-10181cc9 : Kernel code
  10181cca-1020f5ef : Kernel data
  10210000-102a5fff : Framebuffer
ff200000-ff20ffff : dmaca.0
ffe00000-ffe003ff : atmel_spi.0
ffe00400-ffe007ff : atmel_spi.1
ffe01000-ffe013ff : atmel_usart.0
  ffe01000-ffe013ff : atmel_serial
ffe02800-ffe02bff : pio.0
ffe02c00-ffe02fff : pio.1
ffe03000-ffe033ff : pio.2
ffe03400-ffe037ff : pio.3
ffe03800-ffe03bff : pio.4
fff00000-fff0007f : at32_pm.0
fff00080-fff000af : at32ap700x_rtc.0
fff000b0-fff000cf : at32_wdt.0
fff00100-fff0013f : at32_eic.0
fff00400-fff007ff : intc.0
fff00c00-fff00fff : atmel_tcb.0
fff01000-fff013ff : atmel_tcb.1
fff01800-fff01bff : macb.0
fff02400-fff027ff : atmel_mci.0
fff03400-fff037ff : smc.0

Now, I'm trying to map this into CS2 (0x08000000-0x0BFFFFFF) range.. lsuio shows that I've mapped into there, just wondering if I should be able to see it elsewhere..

thanks!


JamesLS wrote:
Another question on the mmaping external memory.. if I have a device mmap'd, can I do the following type of thing:

for (i=0; i<16; i++) {
        mem[start_addr+i] = write_val_int;
    }

ie, direct memory writes? Or do I need to use read/write type functions?

That's the only way you can do it :-). Read and write uses, well, read and write, not mmap. A read on a UIO device will block 'till an interrupt is received, a write will clear the interrupt count. You can't use either function to actually communicate with the device.
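To make the split concrete: register traffic goes through the mmap'd pointer, while read() on the UIO file descriptor is only used to wait for an interrupt. A minimal sketch of the waiting side, assuming the usual UIO convention that each read() returns a 32-bit event count (`uio_wait_irq` is a made-up name):

```c
#include <stdint.h>
#include <unistd.h>

/*
 * Block until the UIO device behind `fd` reports an interrupt.
 * On success stores the running interrupt count and returns 0;
 * returns -1 on error or a short read.
 * (`uio_wait_irq` is a hypothetical helper, not an existing API.)
 */
static int uio_wait_irq(int fd, int32_t *count)
{
    ssize_t n = read(fd, count, sizeof(*count));

    return n == (ssize_t)sizeof(*count) ? 0 : -1;
}
```

A polling design that never needs interrupts can skip this entirely and just use the mapped pointer.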
JamesLS wrote:
Also, should I be able to see the /dev/uio0 device in /proc/iomem if it is set up properly? I get the following:
Only if at some stage you've ioremap()'d it. In my driver I do just that, IIRC you don't. That's fine as ioremap is functionally a NOP on AVR32 (it just does this paperwork kinda stuff) so it should work regardless.

-S.


Excellent, many thanks Squidgit! I'm really looking forward to getting the actual boards back so I can see it in action.. But for now using the /dev/scull device is pretty handy for a simple char driver interface.


Yeah I'm really hoping that the LDD boys release a 4th edition of that book some time soon, it's all kinda out-of-date (as you'd expect with something as moving-target as the Linux Kernel).

Or maybe just a big errata page somewhere to help bring the thing up to date..

Ah well, in the mean time the concepts and demo drivers are still fab.

-S.


Yeah a UIO chapter would be excellent, and perhaps one on embedded device drivers too.. It's still been a big help, although the first couple times I read through it, it was pretty opaque to me.. Definitely something that is easier to ease into with liberal use of the examples and posting on forums such as this one!


Yeah I've got a little tablet lappy thing and I spent probably a month's worth of computing lectures sitting up the back, tablet on my knees, reading and re-reading that darn thing. Glad I did.

-S.


Hi all,

A couple more questions on the UIO stuff. Specifically to do with the word width on the bus. If I have the UIO device setup for a 16 bit data bus width, do I need to set the address bits 0x0000 for first register, 0x0002 for second register? Or can I just go 0, 1, 2 etc

I'm still using the /dev/scull dummy driver and it has no problems with the 0,1,2 method.

I guess to sum it up I'm wondering if the uio mmap takes care of the whole byte enabling thing on its own when interfacing to an external device.

Thanks for any info!

EDIT - After reading the SMC section in the AP7000 doc a little bit I have the feeling that I'll need to just use even addresses, and A[0] won't be used at all. That sound right?


JamesLS wrote:
EDIT - After reading the SMC section in the AP7000 doc a little bit I have the feeling that I'll need to just use even addresses, and A[0] won't be used at all. That sound right?
All addresses are byte-wise, yup. If you want to read blocks of 2-bytes, you increment addresses in lots of 2 :-)
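In C terms, this is where pointer types matter: indexing a 16-bit pointer already steps two bytes per element, so index 1 is byte address 2. A small sketch (helper names invented here) that takes byte offsets, as they would appear in a register map, and converts them to 16-bit indices:

```c
#include <stddef.h>
#include <stdint.h>

/* Read the 16-bit register at byte offset `byte_off` (must be even). */
static inline uint16_t reg16_read(const volatile uint16_t *base,
                                  size_t byte_off)
{
    return base[byte_off / 2];   /* element N lives at byte address 2*N */
}

/* Write the 16-bit register at byte offset `byte_off` (must be even). */
static inline void reg16_write(volatile uint16_t *base, size_t byte_off,
                               uint16_t val)
{
    base[byte_off / 2] = val;
}
```

So counting registers 0, 1, 2 through an `unsigned short *` and counting byte addresses 0, 2, 4 describe the same bus accesses; the thing to avoid is mixing the two conventions.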

-S.


squidgit wrote:
JamesLS wrote:
EDIT - After reading the SMC section in the AP7000 doc a little bit I have the feeling that I'll need to just use even addresses, and A[0] won't be used at all. That sound right?
All addresses are byte-wise, yup. If you want to read blocks of 2-bytes, you increment addresses in lots of 2 :-).

Gotcha! It threw me off a little because it seems that /dev/scull driver isn't bytewise, ie you set it to read 16 bits out at a time and it will just read consecutive addresses.. I'll paste my c routines here in case I'm doing anything weird:

/*
*
*  Routines for reading/writing FPGA registers
*
*/
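#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/mman.h>

#define DEBUG 0
#define VERSION "0.1"
#define FPGA_START 0x08000000
#define FPGA_END   0x0BFFFFFF
/* Parenthesized so it expands safely; +1 because FPGA_END is inclusive */
#define FPGA_SIZE  (FPGA_END - FPGA_START + 1)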


/* FPGA Struct for mmap operations */
struct fpga {
    int fd;
    size_t size;
    unsigned short *io;
};

// fpga_open - open the FPGA static mem device
struct fpga *fpga_open (const char *name) {
    struct fpga *fi;
    
    fi = (struct fpga *)calloc (1, sizeof (struct fpga));
    if (!fi)
        goto out;
    
    fi->size = FPGA_SIZE;
    
    fi->fd = open(name, O_RDWR);
    if (fi->fd == -1)
        goto out_free;
    
    fi->io = mmap(NULL, fi->size, PROT_READ | PROT_WRITE, 
        MAP_SHARED, fi->fd, 0);
    if (fi->io == MAP_FAILED)
        goto out_close;
    
    if (DEBUG) {
        printf("Map succeeded\n");
        printf("Memory Map Size: %zu\n", fi->size);
        printf("File descriptor: %d\n", fi->fd);
        printf("MMap Pointer: %p\n", (void *)fi->io);
    }
       
    return fi;
    
out_close:
    close (fi->fd);
out_free:
    free (fi);
out:
    return NULL;
}

// fpga_close - close the fpga interface
int fpga_close(struct fpga *fi) {
  
    if(!fi)
        return -1;
    munmap(fi->io, fi->size);
    close(fi->fd);
    
    free(fi);
    
    return 0;    
}
 
// fpga_get_io - get pointer to mmaped IO
unsigned short *fpga_get_io (struct fpga *fi) {
    if(!fi)
        return NULL;
    
    return fi->io;
}

unsigned short read_16_reg(unsigned short *mem, int addr) {
    if (DEBUG)
        printf("%04X\n",mem[addr]);
        
    return mem[addr];
}

void write_16_reg(unsigned short *mem, int addr, unsigned short val) {
    mem[addr] = val;
}

I have these commands SWIGged and I call them through python - works great! (on the /dev/scull so far anyway - we'll see what happens on the physical boards)


JamesLS wrote:
Gotcha! It threw me off a little because it seems that /dev/scull driver isn't bytewise, ie you set it to read 16 bits out at a time and it will just read consecutive addresses..
Which version of scull are you using? Is there one with mmap'd memory?
JamesLS wrote:
I'll paste my c routines here in case I'm doing anything weird:
At a first pass looks fine, g'luck :-)

-S.


squidgit wrote:
JamesLS wrote:
Gotcha! It threw me off a little because it seems that /dev/scull driver isn't bytewise, ie you set it to read 16 bits out at a time and it will just read consecutive addresses..
Which version of scull are you using? Is there one with mmap'd memory?

It's just the basic char driver, and I just call fpga_open("/dev/scull") to mmap to the driver. Seems to work ok, I can read/write to and from the memory locations. The real boards may require a little more twiddling, however!

Quote:
JamesLS wrote:
I'll paste my c routines here in case I'm doing anything weird:
At a first pass looks fine, g'luck :-)

Thanks! I'm sure there will be a couple more before the gold code is ready ;)

I must say though, I am so thankful for SWIG.. I've become a huge fan of python lately and being able to just wrap whatever low level C functions to be accessed through a nice scripting language like python speeds development time up immensely!


JamesLS wrote:
It's just the basic char driver, and I just call fpga_open("/dev/scull") to mmap to the driver. Seems to work ok, I can read/write to and from the memory locations. The real boards may require a little more twiddling, however!
And just to plug a loose end, the mem argument to {read,write}_16_reg is some fp->io? Coolies.
JamesLS wrote:
I must say though, I am so thankful for SWIG.. I've become a huge fan of python lately and being able to just wrap whatever low level C functions to be accessed through a nice scripting language like python speeds development time up immensely!
It is a great invention, though C is my speedy-devel language anyway ;-). Or occasionally Java I guess.

I use it so I can write support libraries in one language and very easily offer bindings for what ever language that anyone might want. Huzzah!

-S.


And just to plug a loose end, the mem argument to {read,write}_16_reg is some fp->io? Coolies.

You bet! I use the following python class for the fpga init, and it returns a very struct-like class:

class cpldUio:
    """Connect/disconnect to userspace io object"""
    def __init__(self, device):
        self.device = device
        self.fpga = fpga_routines.fpga_open(self.device)
        self.ptr  = fpga_routines.fpga_get_io(self.fpga)
        self.regMap  = deviceRegMap(self.ptr)
        
    def close(self):
        """Close interface to fpga"""
        fpga_routines.fpga_close(self.fpga)

so the .ptr returned by this class is fed into whatever register read/write.

One strange thing I found with swig is that it did not really like uint16_t types for whatever reason, I think it was just something I was doing wrong in the wrapper .i file, however. Once I just replaced them with unsigned shorts all was well.. Wasn't really worth an in-depth investigation on that I figured.


Augh! Board issues!

I don't seem to be getting a CS2 assert when I do a read/write into my uio driver..

The driver is loaded, /dev/uio0 is there, smc_set_configuration(FPGA_CS, &fpga_config) ran, I get happy messages in my kernel log, and yet no response on CS2.. I figured out my chip select on the NGW100 board and no luck there either..

Oh wait.. Do I need to do something like configure the PE25 gpio port to be CS2 perhaps? Doy! I will check into that first..

Update: Yep, had to do the at32_setup_periph routine on the relevant pins.. working now!

One odd thing is I'm seeing a lot of reads from each individual read access on the bus when I look at it on the LA.. the FPGA is having some difficulty speaking atmel-ese at the moment too but the FPGA designer is on it..


JamesLS wrote:
One odd thing is I'm seeing a lot of reads from each individual read access on the bus when I look at it on the LA.. the FPGA is having some difficulty speaking atmel-ese at the moment too but the FPGA designer is on it..
What addresses are the reads at? And is it consistent each time you do the same read?

I'm sure UIO specifies the relevant pages to be uncached, but it kinda sounds like the CPU might be populating a whole cacheline. Maybe it would be worth specifying the UIO map address explicitly to be through an uncached segment? Actually I have no idea whether that would even work once Linux is up and flying, but could be worth a shot.

Actually, what kernel version are you using? I don't think UIO correctly set the vm page protection flags until recently. Can you try a really recent kernel (maybe even .25-rc8 or latest git) or add

vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

right before the call to remap_pfn_range() in uio.c:uio_mmap_physical()?

Thx,
-S.


Hey Squidgit,

I do a read at pretty much any location and I see 3 sets of reads going on on the logic analyzer. It doesn't seem to hurt the reads though, I've got it successfully reading/writing from/to the registers on the FPGA.

I'm currently on 2.6.24.3 kernel. I'll do a quick rebuild on the kernel with that line tomorrow, thanks!

It was a long day but at least the atmel + fpga are on speaking terms now! :) My nifty python web front end works great for reading/writing the registers - although I'd better password protect it since the FPGA registers control the power supplies on the board, eep. ;)


Hey sounds great :D.

hmm, cache line size is 32 bytes, ISTR your bus width is 2 bytes; you'd need 16 reads to populate. So 3 reads doesn't sound like a cache population. Ah well, would be interesting to see if the page protection flag change makes a difference..

Writes are unaffected?

If this is a problem for any reason it'd be worth pinging the kernel@avr32linux.org mailing list; you'll find more educated people than myself live at the other end of that address ;-)

-S.
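
The arithmetic in the post above can be checked with a couple of lines of C. This is only a sketch: the helper name `reads_per_line_fill` is made up for illustration, and the 32-byte cache line and 2-byte bus width are the figures quoted in the post, not something read back from the hardware.

```c
#include <assert.h>

/* Number of bus transactions needed to fill one cache line.
 * Hypothetical helper, not part of any AVR32 or kernel API. */
unsigned reads_per_line_fill(unsigned line_bytes, unsigned bus_bytes)
{
    assert(bus_bytes != 0);
    /* Round up so a partial final transfer still counts as one read. */
    return (line_bytes + bus_bytes - 1) / bus_bytes;
}
```

With the quoted figures, `reads_per_line_fill(32, 2)` comes out at 16, so a burst of 16 back-to-back reads per access is what a cache-line fill over a 16-bit bus would look like on the LA.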


squidgit wrote:
Hey sounds great :D.

hmm, cache line size is 32 bytes, ISTR your bus width is 2 bytes; you'd need 16 reads to populate. So 3 reads doesn't sound like a cache population. Ah well, would be interesting to see if the page protection flag change makes a difference..

Actually that sounds right that it does 16 reads.. I was a little discombobulated last night when I was posting (cooped up in the evening in a lab with the building HVAC off will do that to ya!) but I meant to post it was 3 groups of reads, it definitely could have been 16 accesses.

Looking at the LA after that mod I see only a single read! woohoo!

Quote:

Writes are unaffected?

Yep, they just have a single CS pulse on a write
Quote:

If this is a problem for any reason it'd be worth pinging the kernel@avr32linux.org mailing list; you'll find more educated people than myself live at the other end of that address ;-)

I think your fix got it! I should subscribe to that list anyway just to get more info on the sundry details of the avr32 kernel..

The FPGA designer did some fiddling to shorten the read cycle times and now it is broke again.. sigh! Register writes still work, but the reads go a little garbly.. Thank the maker for Synplicity's Identify!


JamesLS wrote:
squidgit wrote:
Hey sounds great :D.

hmm, cache line size is 32 bytes, ISTR your bus width is 2 bytes; you'd need 16 reads to populate. So 3 reads doesn't sound like a cache population. Ah well, would be interesting to see if the page protection flag change makes a difference..

Actually that sounds right that it does 16 reads.. I was a little discombobulated last night when I was posting (cooped up in the evening in a lab with the building HVAC off will do that to ya!) but I meant to post it was 3 groups of reads, it definitely could have been 16 accesses.

Looking at the LA after that mod I see only a single read! woohoo!

Woohoo indeed!
Quote:
Quote:

If this is a problem for any reason it'd be worth pinging the kernel@avr32linux.org mailing list; you'll find more educated people than myself live at the other end of that address ;-)

I think your fix got it! I should subscribe to that list anyway just to get more info on the sundry details of the avr32 kernel..

Right, good plan. It's fairly low traffic and you quite often see solutions to problems you didn't even know you had ;-)
Quote:

The FPGA designer did some fiddling to shorten the read cycle times and now it is broke again.. sigh! Register writes still work, but the reads go a little garbly.. Thank the maker for Synplicity's Identify!
Gah, buggrit. Good to hear that, by and large, it's flying nice.

-S.


Question for Squidgit: Are there any known speed issues with the userspace IO drivers?

I ask because I'm trying to program flash devices that are connected to the FPGA, and my software writes 256-byte pages into memory (swigged C routine called in python). As far as I can tell, each 256-byte write is taking about 0.04 seconds - although using the system time for the calculations might be way off. Anyway it ends up taking about 25+ minutes to program a 6 megabyte bitfile into flash.

The C routine I call is as follows:

void write_256_page(unsigned short *mem, int pgSize, int addr, unsigned short *valIn) {
    int i, addrIn;
    addrIn = addr;
    if (DEBUG) {
        printf("Page Size is: %d\n",pgSize);
        printf("Start Address is 0x%08X\n",addrIn);
    }
    for (i = 0;i < pgSize; i++) {
        if (DEBUG) {
            printf("Writing 0x%04X into address 0x%08X\n",valIn[i],(addrIn << 1));
        }
        
        mem[addrIn] = valIn[i];
        addrIn += 1;
    }
}

Anything that looks blatantly slow there?

Thanks!
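
Since the per-page figure is in doubt, one option is to time a page write with a monotonic clock and compare it against the total you would expect. The sketch below is an illustration, not the poster's code: the helper names are hypothetical, and it assumes a POSIX system where `clock_gettime(CLOCK_MONOTONIC, ...)` is available.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Pages needed to cover image_bytes, rounding a partial last page up.
 * Hypothetical helper for the estimate only. */
size_t pages_needed(size_t image_bytes, size_t page_bytes)
{
    assert(page_bytes != 0);
    return (image_bytes + page_bytes - 1) / page_bytes;
}

/* Estimated total time if every page takes sec_per_page seconds. */
double estimate_total_seconds(size_t image_bytes, size_t page_bytes,
                              double sec_per_page)
{
    return (double)pages_needed(image_bytes, page_bytes) * sec_per_page;
}

/* Current monotonic time in seconds; not affected by wall-clock changes. */
double monotonic_seconds(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc; /* keep builds with NDEBUG warning-free */
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}
```

To time one page, take `monotonic_seconds()` before and after the call to `write_256_page` and subtract. Taking 6 megabytes as 6 * 1024 * 1024 bytes, `estimate_total_seconds` gives about 983 seconds, roughly 16 minutes, at 0.04 seconds per 256-byte page. A run of 25+ minutes would then point at either a longer per-page time or time spent outside the page writes.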


No, speed should be fine; the memory pages are just mapped into your vma so writing to them is just a write to memory (albeit an uncached write).

You can use memcpy rather than the for loop or, if you want to keep the loop, swapping the if (DEBUG) for a preprocessor #if DEBUG (and of course making DEBUG a preprocessor macro too) will help a little.

Also, since the memory write is uncached it will be a little slower; writes will be directly limited by the SMC setup and EBI usage. SMC setup is of course a matter for you to fight out with your FPGA dude. As for limiting EBI usage, well, making the executing code footprint as small as possible is about all you can do.

How big is each chunk of data that you call into the 'C' code with? Making this bigger and limiting the indirect function calls and, let's face it, time spent in python will help speed too.

-S.
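
The two suggestions above might look something like this. It is a sketch under assumptions, not the routine from the thread: the function names and the argument order are made up, and `DEBUG` is assumed to be a compile-time macro (for example set with `-DDEBUG=1`). One caveat worth hedging: `memcpy` is free to choose its own access width, so on a device that only decodes 16-bit accesses the explicit loop of 16-bit stores may be the safer variant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#ifndef DEBUG
#define DEBUG 0   /* build with -DDEBUG=1 to compile the trace output in */
#endif

/* Copy n_words 16-bit words to mem[addr..] with a single memcpy call.
 * The access width on the bus is whatever memcpy decides to use. */
void write_page_memcpy(unsigned short *mem, size_t addr,
                       const unsigned short *val_in, size_t n_words)
{
    assert(mem != NULL && val_in != NULL);
#if DEBUG
    printf("Writing %lu words at word address 0x%08lX\n",
           (unsigned long)n_words, (unsigned long)addr);
#endif
    memcpy(&mem[addr], val_in, n_words * sizeof *val_in);
}

/* Same copy as an explicit loop of 16-bit stores. The volatile qualifier
 * stops the compiler from merging or dropping individual accesses. */
void write_page_words(volatile unsigned short *mem, size_t addr,
                      const unsigned short *val_in, size_t n_words)
{
    size_t i;

    assert(mem != NULL && val_in != NULL);
    for (i = 0; i < n_words; i++)
        mem[addr + i] = val_in[i];
}
```

With `DEBUG` left at 0 the `printf` is not compiled at all, so the per-word `if (DEBUG)` test from the original loop disappears from the hot path.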


Thanks for the info Squidgit!

It's good to know the uio isn't the issue here. Some questions were raised here about whether the userspace drivers cause a bunch of extra overhead..

I'll try out the changes you recommend with memcpy and changing to a preprocessor.

Another issue could be that I am fetching from a network mounted file, although I grab the whole file and put it into memory so that really shouldn't be an issue on a per-operation deelie.

Each page I write into that C code is 256 bytes, it's to program a SPI flash device that is hanging off the FPGA. Of course, it would have been *much* easier if we had just connected the SPI to the CPU directly, but that idea was pooh-poohed because the SPI was "too slow" when compared to the 16 bit EBI bus.. mm hmm..

Also with regard to the python I try to make any of the actual memory access operations fully C, so I'm hoping that wouldn't be an issue. Of course, I have been knocking around the idea of doing a fully C flash program operation..


Gulp, I think I'm in over my head here.. Would the best way to perform a memcpy be to mmap the area I need and then do a memcpy into the destination?


Yup. You'll replace a loop like

for (i=0; i<len; i++) dest[i] = src[i];

With a simple

memcpy(dest, src, len);

Everything else (the mmap etc) will be exactly the same :-)

When you say you're loading the file to memory, do you have a massive array which you load it into or do you use some kind of (pythony) prefetch trick? 'Coz in the limited-memory world of the AVR32 you might find prefetched pages being dropped from the pagecache before you use them, which makes the prefetch useless.

-S.
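
Put together, the "mmap once, then memcpy into the mapping" flow could be sketched as follows. Everything named here (`struct io_window`, `io_window_open`, `io_window_write`, `io_window_close`) is hypothetical and is not the `fpga_open` / `fpga_get_io` code from earlier in the thread; the device path and region length are whatever the driver actually exposes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* A mapped register window. Hypothetical type, for illustration only. */
struct io_window {
    int fd;
    unsigned short *io;   /* 16-bit view of the mapping */
    size_t len;           /* mapping length in bytes */
};

/* Open `path` and map `len` bytes from offset 0.
 * Returns 0 on success, -1 on failure. */
int io_window_open(struct io_window *w, const char *path, size_t len)
{
    void *p;

    assert(w != NULL && path != NULL && len != 0);
    w->fd = open(path, O_RDWR | O_SYNC);
    if (w->fd < 0)
        return -1;
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, w->fd, 0);
    if (p == MAP_FAILED) {
        close(w->fd);
        return -1;
    }
    w->io = p;
    w->len = len;
    return 0;
}

/* Copy n_words 16-bit words into the window at word index `addr`. */
void io_window_write(struct io_window *w, size_t addr,
                     const unsigned short *src, size_t n_words)
{
    assert((addr + n_words) * sizeof *src <= w->len);
    memcpy(&w->io[addr], src, n_words * sizeof *src);
}

/* Unmap the window and close the underlying file descriptor. */
void io_window_close(struct io_window *w)
{
    munmap(w->io, w->len);
    close(w->fd);
}
```

On the board this would presumably be opened on something like `/dev/uio0` with the size of the FPGA region, mapped once at start-up, and then `io_window_write` called per page; only the copy changes, the mmap side stays as it was.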


Hi, I'm trying to use the CPLD on my board as a memory-mapped device. I've tried to follow this interesting thread.
In the end I put the following lines in the board setup file:

// UIO
static struct resource uio_resource[] = {
   {
      .start   = 0x08000000,
      .end     = 0x0bffffff,
   },
};

static struct platform_device uio_device = {
   .name      = "uio-cpld",
   .id      = -1,
   .resource   = uio_resource,
   .num_resources   = ARRAY_SIZE(uio_resource),
};

...
// UIO
platform_device_register(&uio_device); 

I've written the driver part following the file attached to this thread, "generic UIO interface with no irq".
As a result, the following error appears on kernel boot:

Quote:

uio_cpld_init( )
kobject (90238c5c): tried to init an initialized object, something is seriously wrong.
Call trace:
[<9001abc4>] dump_stack+0x18/0x20
[<900e6118>] kobject_init+0x28/0x5c
[<90112d72>] device_initialize+0x16/0x74
[<90115ae8>] platform_device_register+0xc/0x14
[<9000b4e8>] uio_cpld_init+0x10/0x1c
[<900003e2>] kernel_init+0x8e/0x1c8
[<90023b5c>] do_exit+0x0/0x420

Ô1•Kl—øÌþðàk€Ðô°>: failed to claim resource 0


Any suggestions or help?
Thanks



jaws75 wrote:
Any suggestions or help?

Stop ignoring the compiler warnings.

static struct platform_driver uio_cpld_device = {

(...)

   return platform_device_register(&uio_cpld_device);

You're trying to register a platform driver as a platform device. A struct platform_driver goes through platform_driver_register(), not platform_device_register().
