io prediction: a userspace proof-of-concept

The io latency tracking mechanism was recently added to the energy aware
scheduler kernel tree. The purpose of this framework is to provide a way to
predict IO latencies, in other words to guess how long we will sleep while
waiting for an IO.

When the cpu goes idle, the timer tells us how long the sleep will last, but
for the other wakeup sources we rely on statistics in the menu governor, which
is part of the cpuidle framework. The io latency tracking provides additional
information about the expected sleep length which, combined with the timer
duration, should give us a more accurate prediction.

The first step of the io latency tracking simply uses a sliding average of the
values, which is not really accurate as it is not immune to IO ping-pong or
big variations.

The POC proposed here implements the algorithm in userspace in order to
validate the approach before replacing the current sliding average with this
predictor.

The program itself contains two parts:
 - one part is the simulation of the IO.
 - the other part is the algorithm itself, meant to be included in the kernel
   code. For this reason, it is written in kernel style.

1. Simulation of the IO

Pretty simple. Writes a big file and reads a chunk of it at random places.
Computes how long the read took and feeds this latency to the algorithm.

2. Prediction algorithm

Each latency is grouped into a bucket which represents an interval of latency.
Why? Because we don't want to take each latency individually and compute
statistics on it. That does not make sense: it takes a lot of memory and
computation time for a result which is, in the end, mathematically impossible
to find. It is better to use an interval.

Eg. 186us, 123us, 134us can fall into the bucket [100 - 199], in which case
the latency is 199us.

The size of the bucket is the bucket interval and represents the resolution of
the statistical model.
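As an illustration of the grouping described above, here is a minimal sketch of how a latency maps to a bucket and to the bounds of that bucket. The division mirrors bucket_index() in iolatsimu.c; struct bucket_range and bucket_range_of() are hypothetical names that exist only for this example, and the interval is passed as a parameter instead of using BUCKET_INTERVAL.

```c
#include <assert.h>

/* Hypothetical type for this sketch, not part of iolatsimu.c */
struct bucket_range {
	unsigned int index;	/* latency / interval */
	unsigned int low_us;	/* inclusive lower bound of the bucket */
	unsigned int high_us;	/* inclusive upper bound of the bucket */
};

/*
 * Map a latency to its bucket. The index computation is the same
 * integer division as bucket_index(); the bounds are derived from it.
 */
static struct bucket_range bucket_range_of(unsigned int latency_us,
					   unsigned int interval_us)
{
	struct bucket_range r;

	assert(interval_us > 0);

	r.index = latency_us / interval_us;
	r.low_us = r.index * interval_us;
	r.high_us = r.low_us + interval_us - 1;

	return r;
}
```

With an interval of 100us, the three latencies 186us, 123us and 134us all land in index 1, whose bounds are 100us and 199us.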
Eg. with a bucket interval of 1us, we end up doing statistics on *all*
numbers, with of course a bad prediction. Choosing the size of the bucket
interval vs the idle sleep time is the tradeoff we have to find. I believe
that with a 200us bucket interval we still have a good prediction and cover
the idle state target residency.

The buckets are dynamically created and stored in a list. A new bucket is
added at the end of the list. The order of this list keeps changing depending
on the number of successive hits each bucket gets. The more often a bucket is
hit successively, the more likely it is to be the first element of the list.

The guessed next latency, which is a bucket (meaning it will be between eg.
200us and 300us with a bucket interval of 100us), is retrieved from the list.
Each bucket present in the list gets a score: the more hits a bucket has, the
bigger its score. *But* this is weighted by the position in the list. The
first elements have more weight than the last ones. This position changes
dynamically when a bucket is hit several times.

Example with the following latencies:

	10, 100, 100, 100, 100, 100, 10, 10

We will have two buckets: 0 and 1.

	   10 => bucket0(1)
	  100 => bucket0(1), bucket1(1)
	  100 => bucket0(1), bucket1(2)
	  100 => bucket0(1), bucket1(3)
	  100 => bucket0(1), bucket1(4)
	* 100 => bucket1(5), bucket0(1)
	   10 => bucket1(5), bucket0(2)
	   10 => bucket1(5), bucket0(3)

At (*), bucket1 reached 5 successive hits and has been moved to the beginning
of the list. The first element became the second one.

Some measurements with bucket interval 1ms, 500us, 200us and 100us.
                   ---------------------------------
                   |  1ms  | 500us | 200us | 100us |
----------------------------------------------------
| SSD 6Gb/s        | 99.7% | 99.9% | 99.7% | 85.7% |
| SD card class 10 | 97.7% | 96.8% | 95.5% | 67.6% |
| SD card class 4  | 54.3% | 55.8% | 29.5% | 31.4% |
| HDD on USB       | 93.6% | 86.3% | 66.3% | 45.0% |
----------------------------------------------------

These measurements were made in a specific context: one process accessing the
file, no writes. The hardware is probably doing some optimization through
readahead. But at least it is a good starting point.

Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr>
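The list dynamics and the position-weighted score from the example in the commit message can be sketched as follows. This is only an illustration under stated assumptions: it uses a small fixed array instead of the kernel-style list, a 100us interval, and hypothetical names (sketch_bucket, sketch_list, sketch_fill, sketch_guess). The halving of the weight per position follows the `1 << weight` divisor in iolatsimu.c; letting ties go to the earlier position is a choice of this sketch.

```c
#include <assert.h>
#include <string.h>

#define MAX_BUCKETS	8	/* assumption: enough for the example */
#define INTERVAL_US	100	/* assumption: interval of the example */
#define SUCCESSIVE	5	/* same value as BUCKET_SUCCESSIVE */

struct sketch_bucket {
	int index;
	int hits;
	int successive_hits;
};

struct sketch_list {
	struct sketch_bucket b[MAX_BUCKETS];	/* b[0] is the list head */
	int nr;
};

/*
 * Record one latency. Returns the position of its bucket in the list
 * afterwards, or -1 if the array is full. As in iolatsimu.c, the
 * per-bucket counter is not reset when another bucket is hit.
 */
static int sketch_fill(struct sketch_list *l, int latency_us)
{
	int index = latency_us / INTERVAL_US;
	struct sketch_bucket tmp;
	int i;

	assert(l && latency_us >= 0);

	for (i = 0; i < l->nr; i++)
		if (l->b[i].index == index)
			break;

	if (i == l->nr) {
		if (l->nr == MAX_BUCKETS)
			return -1;
		/* New buckets are appended at the end of the list */
		l->b[i].index = index;
		l->b[i].hits = 0;
		l->b[i].successive_hits = 0;
		l->nr++;
	}

	l->b[i].hits++;
	l->b[i].successive_hits++;

	if (l->b[i].successive_hits == SUCCESSIVE) {
		/* Threshold reached: move the bucket to the front */
		tmp = l->b[i];
		tmp.successive_hits = 1;
		memmove(&l->b[1], &l->b[0], i * sizeof(tmp));
		l->b[0] = tmp;
		i = 0;
	}

	return i;
}

/* Guess the next bucket index: hits halved for each list position */
static int sketch_guess(const struct sketch_list *l)
{
	int i, score, score_max = -1, winner = -1;

	for (i = 0; i < l->nr; i++) {
		score = l->b[i].hits >> i;
		if (score > score_max) {
			score_max = score;
			winner = l->b[i].index;
		}
	}

	return winner;
}
```

Feeding the latencies 10, 100, 100, 100, 100, 100, 10, 10 to sketch_fill() on a zeroed sketch_list ends with bucket1(5) followed by bucket0(3), and sketch_guess() then picks bucket 1.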
author: Daniel Lezcano <daniel.lezcano@linaro.org> 2014-06-27 12:54:56 +0200
committer: Daniel Lezcano <daniel.lezcano@linaro.org> 2014-06-27 13:55:16 +0200
commit: cea7881da79342c03c6d21ea3f3d24c3ff830c8b (patch)
tree: f805b9ed3124472cdb97e412f6abb7ddb2b0fc74
2 files changed, 934 insertions, 0 deletions
diff --git a/iolatsimu.c b/iolatsimu.c
new file mode 100644
index 0000000..2d8badc
--- /dev/null
+++ b/iolatsimu.c
@@ -0,0 +1,345 @@
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <sys/time.h>
+#include <fcntl.h>
+
+#include "list.h"
+
+/*
+ * The resolution of the statistics in usec: the latency for a bucket
+ * is BUCKET_INTERVAL * index.
+ * The finer the resolution (smaller interval), the worse the prediction.
+ * Some measurements:
+ *
+ * For 1ms:
+ *  SSD 6Gb/s       : 99.7%
+ *  SD card class 10: 97.7%
+ *  SD card class 4 : 54.3%
+ *  HDD on USB      : 93.6%
+ *
+ * For 500us:
+ *  SSD 6Gb/s               : 99.9%
+ *  SD card class 10        : 96.8%
+ *  SD card class 4         : 55.8%
+ *  HDD on USB              : 86.3%
+ *
+ * For 200us:
+ *  SSD 6Gb/s               : 99.7%
+ *  SD card class 10        : 95.5%
+ *  SD card class 4         : 29.5%
+ *  HDD on USB              : 66.3%
+ *
+ * For 100us:
+ *  SSD 6Gb/s               : 85.7%
+ *  SD card class 10        : 67.63%
+ *  SD card class 4         : 31.4%
+ *  HDD on USB              : 44.97%
+ *
+ * Aiming at 100% is not necessarily good because we want to hit the
+ * correct idle state. Setting a low resolution will group the different
+ * latencies into a big interval which may overlap with the cpuidle state
+ * target residency.
+ *
+ */
+#define BUCKET_INTERVAL 500
+
+/*
+ * Number of successive hits for the same bucket. That is the threshold
+ * triggering the move of the element to the beginning of the list, so
+ * that it gets more weight in the statistics when guessing the next
+ * latency.
+ */
+#define BUCKET_SUCCESSIVE 5
+
+/*
+ * For debugging purpose
+ */
+static int success   = 0;
+static int nrlatency = 0;
+
+/*
+ * What is a bucket ?
+ *
+ * A bucket is an interval of latency. This interval is defined with the
+ * BUCKET_INTERVAL. The bucket index gives what latency interval we have.
+ * For example, if you have an index 2 and a bucket interval of 1000usec,
+ * then the bucket contains the latencies between 2000 and 2999 usec.
+ *
+ */
+struct bucket {
+	int hits;
+	int successive_hits;
+	int index;
+	struct list_head list;
+};
+
+static LIST_HEAD(bucket_list);
+
+/*
+ * Find a bucket associated with the specified index
+ */
+static struct bucket *bucket_find(int index)
+{
+	struct list_head *list;
+	struct bucket *bucket = NULL;
+
+	list_for_each(list, &bucket_list) {
+		
+		bucket = list_entry(list, struct bucket, list);
+
+		if (bucket->index == index)
+			return bucket;
+	}
+
+	return NULL;
+}
+
+/*
+ * Allocate a bucket
+ */
+static struct bucket *bucket_alloc(int index)
+{
+	struct bucket *bucket;
+
+	bucket = malloc(sizeof(*bucket));
+	if (bucket) {
+		bucket->hits  = 0;
+		bucket->successive_hits = 0;
+		bucket->index = index;
+		INIT_LIST_HEAD(&bucket->list);
+	}
+	
+	return bucket;
+}
+
+/*
+ * The list is ordered by history. The first element is the one with
+ * the most *successive* hits. This function is called each time a new
+ * latency is inserted. In the kernel it will replace the io latency
+ * avg, which is pretty simple and inaccurate.
+ *
+ * The algorithm is pretty simple here: as the first element is the
+ * one most likely to occur next, its weight is the biggest, the
+ * second one has less weight, etc ...
+ *
+ * The bucket which has the maximum score (number of hits weighted by
+ * its position in the list) is the bucket most likely to occur
+ * next.
+ *
+ */
+static int bucket_guessed_index(void)
+{
+	int weight = 0;
+	int score, score_max = 0, winner = 0;
+	struct bucket *bucket;
+	struct list_head *list;
+
+	if (list_empty(&bucket_list))
+		return -1;
+
+	list_for_each(list, &bucket_list) {
+		bucket = list_entry(list, struct bucket, list);
+
+		/*
+		 * The list is ordered by history: each position halves
+		 * the weight of the bucket's hits, so the first element
+		 * counts the most.
+		 */
+		score = weight < 31 ? bucket->hits >> weight : 0;
+		weight++;
+
+		if (score < score_max)
+			continue;
+
+		score_max = score;
+		winner = bucket->index;
+	}
+
+	return winner;
+}
+
+/*
+ * Return the bucket index for the specified latency
+ */
+static int bucket_index(int latency)
+{
+	return latency / BUCKET_INTERVAL;
+}
+
+/*
+ * The dynamics of the list are the following:
+ * - Each new element is inserted at the end of the list
+ * - Each element passing <BUCKET_SUCCESSIVE> times in this function
+ *   is elected to be moved to the beginning of the list
+ */
+static int bucket_fill(int latency)
+{
+	int index = bucket_index(latency);
+	struct bucket *bucket;
+
+	/*
+	 * For debugging purpose
+	 */
+	nrlatency++;
+	if (bucket_guessed_index() == index)
+		success++;
+
+	/*
+	 * Find the bucket associated with the index
+	 */
+	bucket = bucket_find(index);
+	if (!bucket) {
+		bucket = bucket_alloc(index);
+		if (!bucket)
+			return -1;
+
+		list_add_tail(&bucket->list, &bucket_list);
+	}
+
+	/*
+	 * Increase the number of times this bucket has been hit
+	 */
+	bucket->hits++;
+	bucket->successive_hits++;
+
+	/*
+	 * This bucket reached the successive hits threshold, move it
+	 * to the beginning of the list
+	 */
+	if (bucket->successive_hits == BUCKET_SUCCESSIVE) {
+		list_move(&bucket->list, &bucket_list);
+		bucket->successive_hits = 1;
+	}
+
+	return 0;
+}
+
+/*
+ * For debugging purpose
+ */
+static void bucket_show(void)
+{
+	struct list_head *list;
+	struct bucket *bucket;
+
+	list_for_each(list, &bucket_list) {
+		
+		bucket = list_entry(list, struct bucket, list);
+
+		printf("bucket %d: %d\n",
+		       bucket->index, bucket->hits);
+	}
+
+	printf("Number of correct predictions: %d/%d (%.2f%%)\n",
+	       success, nrlatency, ((float)success / (float)nrlatency) * 100.0);
+}
+
+/*
+ * IO latency simulation main function.
+ *
+ * Writes a buffer at different offsets and reads it back at random
+ * offsets. Uses fadvise to ask the OS not to keep the data in cache,
+ * so that we measure the real access time.
+ */
+
+#define NROFFSET 256
+#define PAGESIZE 16384
+
+static char buffer[PAGESIZE];
+
+/*
+ * Write the big file with size NROFFSET * PAGESIZE
+ */
+void write_file(int fd)
+{
+	int i = 0;
+
+	for (i = 0; i < NROFFSET; i++)
+		write(fd, buffer, sizeof(buffer));
+
+	fsync(fd);
+}
+
+int mktempfile(const char *dirname)
+{
+	char *name;
+
+	if (asprintf(&name, "%s/XXXXXX", dirname) < 0) {
+		perror("asprintf");
+		return -1;
+	}
+
+	return mkstemp(name);
+}
+
+int main(int argc, char *argv[])
+{
+	struct timeval begin, end;
+	off_t offset;
+	int i;
+	unsigned long int latency;
+	const char *dirname = "/tmp";
+	int fd;
+
+	/* 
+	 * Optionally we can specify the directory to test the
+	 * latencies
+	 */
+	if (argc == 2)
+		dirname = argv[1];
+
+	/*
+	 * Make temporary file, we don't want to pollute the file system
+	 * with big files if this program crashes
+	 */
+	fd = mktempfile(dirname);
+	if (fd < 0) {
+		perror("mktempfile");
+		return -1;
+	}
+
+	write_file(fd);
+
+	/*
+	 * Initialize the random seed with the current number of usec.
+	 * The random values are used to access the file randomly, avoiding
+	 * sequential accesses which can be optimized by the hardware
+	 */
+	gettimeofday(&begin, NULL);
+	srandom(begin.tv_usec);
+
+	for (i = 0; i < 10000; i++) {
+
+		/*
+		 * Compute the offset address to read from
+		 */
+		offset = (random() % NROFFSET) * PAGESIZE;
+
+		/*
+		 * Hint that this range is not needed in cache, see man posix_fadvise
+		 */
+		posix_fadvise(fd, offset, PAGESIZE, POSIX_FADV_DONTNEED);
+
+		/*
+		 * Measure the time to read a PAGESIZE buffer
+		 */
+		gettimeofday(&begin, NULL);
+		pread(fd, buffer, PAGESIZE, offset);
+		gettimeofday(&end, NULL);
+		latency = ((end.tv_sec - begin.tv_sec) * 1000000) + (
+			end.tv_usec - begin.tv_usec);
+
+		/*
+		 * Fill a bucket with this latency
+		 */
+		bucket_fill(latency);
+	}
+
+	bucket_show();
+
+	close(fd);
+
+	return 0;
+}
diff --git a/list.h b/list.h
new file mode 100644
index 0000000..41d74c0
--- /dev/null
+++ b/list.h
@@ -0,0 +1,589 @@
+/*
+ *  list.h
+ *
+ *  Copyright (C) 2014, Linaro Limited.
+ *
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful, but
+ *  WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *  General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, write to the Free Software Foundation, Inc.,
+ *  59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
+ *
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * Contributors:
+ *     Daniel Lezcano <daniel.lezcano@linaro.org>
+ *     Zoran Markovic <zoran.markovic@linaro.org>
+ *
+ */
+#ifndef _LINUX_LIST_H
+#define _LINUX_LIST_H
+
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+
+#define LIST_POISON1 ((void *)0x00100100)
+#define LIST_POISON2 ((void *)0x00200200)
+
+struct list_head {
+	struct list_head *next, *prev;
+};
+
+/*
+ * Simple doubly linked list implementation.
+ *
+ * Some of the internal functions ("__xxx") are useful when
+ * manipulating whole lists rather than single entries, as
+ * sometimes we already know the next/prev entries and we can
+ * generate better code by using them directly rather than
+ * using the generic single-entry routines.
+ */
+
+#define LIST_HEAD_INIT(name) { &(name), &(name) }
+
+#define LIST_HEAD(name) \
+	struct list_head name = LIST_HEAD_INIT(name)
+
+static inline void INIT_LIST_HEAD(struct list_head *list)
+{
+	list->next = list;
+	list->prev = list;
+}
+
+/*
+ * Insert a new entry between two known consecutive entries.
+ *
+ * This is only for internal list manipulation where we know
+ * the prev/next entries already!
+ */
+static inline void __list_add(struct list_head *new,
+			      struct list_head *prev,
+			      struct list_head *next)
+{
+	next->prev = new;
+	new->next = next;
+	new->prev = prev;
+	prev->next = new;
+}
+
+/**
+ * list_add - add a new entry
+ * @new: new entry to be added
+ * @head: list head to add it after
+ *
+ * Insert a new entry after the specified head.
+ * This is good for implementing stacks.
+ */
+static inline void list_add(struct list_head *new, struct list_head *head)
+{
+	__list_add(new, head, head->next);
+}
+
+
+/**
+ * list_add_tail - add a new entry
+ * @new: new entry to be added
+ * @head: list head to add it before
+ *
+ * Insert a new entry before the specified head.
+ * This is useful for implementing queues.
+ */
+static inline void list_add_tail(struct list_head *new, struct list_head *head)
+{
+	__list_add(new, head->prev, head);
+}
+
+/*
+ * Delete a list entry by making the prev/next entries
+ * point to each other.
+ *
+ * This is only for internal list manipulation where we know
+ * the prev/next entries already!
+ */
+static inline void __list_del(struct list_head *prev, struct list_head *next)
+{
+	next->prev = prev;
+	prev->next = next;
+}
+
+/**
+ * list_del - deletes entry from list.
+ * @entry: the element to delete from the list.
+ * Note: list_empty() on entry does not return true after this, the entry is
+ * in an undefined state.
+ */
+static inline void __list_del_entry(struct list_head *entry)
+{
+	__list_del(entry->prev, entry->next);
+}
+
+static inline void list_del(struct list_head *entry)
+{
+	__list_del(entry->prev, entry->next);
+	entry->next = LIST_POISON1;
+	entry->prev = LIST_POISON2;
+}
+
+/**
+ * list_replace - replace old entry by new one
+ * @old : the element to be replaced
+ * @new : the new element to insert
+ *
+ * If @old was empty, it will be overwritten.
+ */
+static inline void list_replace(struct list_head *old,
+				struct list_head *new)
+{
+	new->next = old->next;
+	new->next->prev = new;
+	new->prev = old->prev;
+	new->prev->next = new;
+}
+
+static inline void list_replace_init(struct list_head *old,
+					struct list_head *new)
+{
+	list_replace(old, new);
+	INIT_LIST_HEAD(old);
+}
+
+/**
+ * list_del_init - deletes entry from list and reinitialize it.
+ * @entry: the element to delete from the list.
+ */
+static inline void list_del_init(struct list_head *entry)
+{
+	__list_del_entry(entry);
+	INIT_LIST_HEAD(entry);
+}
+
+/**
+ * list_move - delete from one list and add as another's head
+ * @list: the entry to move
+ * @head: the head that will precede our entry
+ */
+static inline void list_move(struct list_head *list, struct list_head *head)
+{
+	__list_del_entry(list);
+	list_add(list, head);
+}
+
+/**
+ * list_move_tail - delete from one list and add as another's tail
+ * @list: the entry to move
+ * @head: the head that will follow our entry
+ */
+static inline void list_move_tail(struct list_head *list,
+				  struct list_head *head)
+{
+	__list_del_entry(list);
+	list_add_tail(list, head);
+}
+
+/**
+ * list_is_last - tests whether @list is the last entry in list @head
+ * @list: the entry to test
+ * @head: the head of the list
+ */
+static inline int list_is_last(const struct list_head *list,
+				const struct list_head *head)
+{
+	return list->next == head;
+}
+
+/**
+ * list_empty - tests whether a list is empty
+ * @head: the list to test.
+ */
+static inline int list_empty(const struct list_head *head)
+{
+	return head->next == head;
+}
+
+/**
+ * list_empty_careful - tests whether a list is empty and not being modified
+ * @head: the list to test
+ *
+ * Description:
+ * tests whether a list is empty _and_ checks that no other CPU might be
+ * in the process of modifying either member (next or prev)
+ *
+ * NOTE: using list_empty_careful() without synchronization
+ * can only be safe if the only activity that can happen
+ * to the list entry is list_del_init(). Eg. it cannot be used
+ * if another CPU could re-list_add() it.
+ */
+static inline int list_empty_careful(const struct list_head *head)
+{
+	struct list_head *next = head->next;
+	return (next == head) && (next == head->prev);
+}
+
+/**
+ * list_rotate_left - rotate the list to the left
+ * @head: the head of the list
+ */
+static inline void list_rotate_left(struct list_head *head)
+{
+	struct list_head *first;
+
+	if (!list_empty(head)) {
+		first = head->next;
+		list_move_tail(first, head);
+	}
+}
+
+/**
+ * list_is_singular - tests whether a list has just one entry.
+ * @head: the list to test.
+ */
+static inline int list_is_singular(const struct list_head *head)
+{
+	return !list_empty(head) && (head->next == head->prev);
+}
+
+static inline void __list_cut_position(struct list_head *list,
+		struct list_head *head, struct list_head *entry)
+{
+	struct list_head *new_first = entry->next;
+	list->next = head->next;
+	list->next->prev = list;
+	list->prev = entry;
+	entry->next = list;
+	head->next = new_first;
+	new_first->prev = head;
+}
+
+/**
+ * list_cut_position - cut a list into two
+ * @list: a new list to add all removed entries
+ * @head: a list with entries
+ * @entry: an entry within head, could be the head itself
+ *	and if so we won't cut the list
+ *
+ * This helper moves the initial part of @head, up to and
+ * including @entry, from @head to @list. You should
+ * pass on @entry an element you know is on @head. @list
+ * should be an empty list or a list you do not care about
+ * losing its data.
+ *
+ */
+static inline void list_cut_position(struct list_head *list,
+		struct list_head *head, struct list_head *entry)
+{
+	if (list_empty(head))
+		return;
+	if (list_is_singular(head) &&
+		(head->next != entry && head != entry))
+		return;
+	if (entry == head)
+		INIT_LIST_HEAD(list);
+	else
+		__list_cut_position(list, head, entry);
+}
+
+static inline void __list_splice(const struct list_head *list,
+				 struct list_head *prev,
+				 struct list_head *next)
+{
+	struct list_head *first = list->next;
+	struct list_head *last = list->prev;
+
+	first->prev = prev;
+	prev->next = first;
+
+	last->next = next;
+	next->prev = last;
+}
+
+/**
+ * list_splice - join two lists, this is designed for stacks
+ * @list: the new list to add.
+ * @head: the place to add it in the first list.
+ */
+static inline void list_splice(const struct list_head *list,
+				struct list_head *head)
+{
+	if (!list_empty(list))
+		__list_splice(list, head, head->next);
+}
+
+/**
+ * list_splice_tail - join two lists, each list being a queue
+ * @list: the new list to add.
+ * @head: the place to add it in the first list.
+ */
+static inline void list_splice_tail(struct list_head *list,
+				struct list_head *head)
+{
+	if (!list_empty(list))
+		__list_splice(list, head->prev, head);
+}
+
+/**
+ * list_splice_init - join two lists and reinitialise the emptied list.
+ * @list: the new list to add.
+ * @head: the place to add it in the first list.
+ *
+ * The list at @list is reinitialised
+ */
+static inline void list_splice_init(struct list_head *list,
+				    struct list_head *head)
+{
+	if (!list_empty(list)) {
+		__list_splice(list, head, head->next);
+		INIT_LIST_HEAD(list);
+	}
+}
+
+/**
+ * list_splice_tail_init - join two lists and reinitialise the emptied list
+ * @list: the new list to add.
+ * @head: the place to add it in the first list.
+ *
+ * Each of the lists is a queue.
+ * The list at @list is reinitialised
+ */
+static inline void list_splice_tail_init(struct list_head *list,
+					 struct list_head *head)
+{
+	if (!list_empty(list)) {
+		__list_splice(list, head->prev, head);
+		INIT_LIST_HEAD(list);
+	}
+}
+
+#undef offsetof
+#define offsetof(s, m)      ((size_t)&(((s *)0)->m))
+
+#undef container_of
+#define container_of(ptr, type, member) ({			\
+	const typeof(((type *)0)->member) * __mptr = (ptr);	\
+	(type *)((char *)__mptr - offsetof(type, member)); })
+
+/**
+ * list_entry - get the struct for this entry
+ * @ptr:	the &struct list_head pointer.
+ * @type:	the type of the struct this is embedded in.
+ * @member:	the name of the list_struct within the struct.
+ */
+#define list_entry(ptr, type, member) \
+	container_of(ptr, type, member)
+
+/**
+ * list_first_entry - get the first element from a list
+ * @ptr:	the list head to take the element from.
+ * @type:	the type of the struct this is embedded in.
+ * @member:	the name of the list_struct within the struct.
+ *
+ * Note, that list is expected to be not empty.
+ */
+#define list_first_entry(ptr, type, member) \
+	list_entry((ptr)->next, type, member)
+
+/**
+ * list_for_each	-	iterate over a list
+ * @pos:	the &struct list_head to use as a loop cursor.
+ * @head:	the head for your list.
+ */
+#define list_for_each(pos, head) \
+	for (pos = (head)->next; pos != (head); pos = pos->next)
+
+/**
+ * __list_for_each	-	iterate over a list
+ * @pos:	the &struct list_head to use as a loop cursor.
+ * @head:	the head for your list.
+ *
+ * This variant doesn't differ from list_for_each() any more.
+ * We don't do prefetching in either case.
+ */
+#define __list_for_each(pos, head) \
+	for (pos = (head)->next; pos != (head); pos = pos->next)
+
+/**
+ * list_for_each_prev	-	iterate over a list backwards
+ * @pos:	the &struct list_head to use as a loop cursor.
+ * @head:	the head for your list.
+ */
+#define list_for_each_prev(pos, head) \
+	for (pos = (head)->prev; pos != (head); pos = pos->prev)
+
+/**
+ * list_for_each_safe - iterate over a list safe against removal of list entry
+ * @pos:	the &struct list_head to use as a loop cursor.
+ * @n:		another &struct list_head to use as temporary storage
+ * @head:	the head for your list.
+ */
+#define list_for_each_safe(pos, n, head) \
+	for (pos = (head)->next, n = pos->next; pos != (head); \
+		pos = n, n = pos->next)
+
+/**
+ * list_for_each_prev_safe - iterate over a list backwards safe against removal of list entry
+ * @pos:	the &struct list_head to use as a loop cursor.
+ * @n:		another &struct list_head to use as temporary storage
+ * @head:	the head for your list.
+ */
+#define list_for_each_prev_safe(pos, n, head) \
+	for (pos = (head)->prev, n = pos->prev; \
+	     pos != (head); \
+	     pos = n, n = pos->prev)
+
+/**
+ * list_for_each_entry	-	iterate over list of given type
+ * @pos:	the type * to use as a loop cursor.
+ * @head:	the head for your list.
+ * @member:	the name of the list_struct within the struct.
+ */
+#define list_for_each_entry(pos, head, member)				\
+	for (pos = list_entry((head)->next, typeof(*pos), member);	\
+	     &pos->member != (head);	\
+	     pos = list_entry(pos->member.next, typeof(*pos), member))
+
+/**
+ * list_for_each_entry_reverse - iterate backwards over list of given type.
+ * @pos:	the type * to use as a loop cursor.
+ * @head:	the head for your list.
+ * @member:	the name of the list_struct within the struct.
+ */
+#define list_for_each_entry_reverse(pos, head, member)			\
+	for (pos = list_entry((head)->prev, typeof(*pos), member);	\
+	     &pos->member != (head);	\
+	     pos = list_entry(pos->member.prev, typeof(*pos), member))
+
+/**
+ * list_prepare_entry - prepare a entry for use in list_for_each_entry_continue()
+ * @pos:	the type * to use as a start point
+ * @head:	the head of the list
+ * @member:	the name of the list_struct within the struct.
+ *
+ * Prepares a entry for use as a start point in list_for_each_entry_continue().
+ */
+#define list_prepare_entry(pos, head, member) \
+	((pos) ? : list_entry(head, typeof(*pos), member))
+
+/**
+ * list_for_each_entry_continue - continue iteration over list of given type
+ * @pos:	the type * to use as a loop cursor.
+ * @head:	the head for your list.
+ * @member:	the name of the list_struct within the struct.
+ *
+ * Continue to iterate over list of given type, continuing after
+ * the current position.
+ */
+#define list_for_each_entry_continue(pos, head, member)		\
+	for (pos = list_entry(pos->member.next, typeof(*pos), member);	\
+	     &pos->member != (head);	\
+	     pos = list_entry(pos->member.next, typeof(*pos), member))
+
+/**
+ * list_for_each_entry_continue_reverse - iterate backwards from the given point
+ * @pos:	the type * to use as a loop cursor.
+ * @head:	the head for your list.
+ * @member:	the name of the list_struct within the struct.
+ *
+ * Start to iterate over list of given type backwards, continuing after
+ * the current position.
+ */
+#define list_for_each_entry_continue_reverse(pos, head, member)		\
+	for (pos = list_entry(pos->member.prev, typeof(*pos), member);	\
+	     &pos->member != (head);	\
+	     pos = list_entry(pos->member.prev, typeof(*pos), member))
+
+/**
+ * list_for_each_entry_from - iterate over list of given type from the current point
+ * @pos:	the type * to use as a loop cursor.
+ * @head:	the head for your list.
+ * @member:	the name of the list_struct within the struct.
+ *
+ * Iterate over list of given type, continuing from current position.
+ */
+#define list_for_each_entry_from(pos, head, member)			\
+	for (; &pos->member != (head);	\
+	     pos = list_entry(pos->member.next, typeof(*pos), member))
+
+/**
+ * list_for_each_entry_safe - iterate over list of given type safe against removal of list entry
+ * @pos:	the type * to use as a loop cursor.
+ * @n:		another type * to use as temporary storage
+ * @head:	the head for your list.
+ * @member:	the name of the list_struct within the struct.
+ */
+#define list_for_each_entry_safe(pos, n, head, member)			\
+	for (pos = list_entry((head)->next, typeof(*pos), member),	\
+		n = list_entry(pos->member.next, typeof(*pos), member);	\
+	     &pos->member != (head);					\
+	     pos = n, n = list_entry(n->member.next, typeof(*n), member))
+
+/**
+ * list_for_each_entry_safe_continue - continue list iteration safe against removal
+ * @pos:	the type * to use as a loop cursor.
+ * @n:		another type * to use as temporary storage
+ * @head:	the head for your list.
+ * @member:	the name of the list_struct within the struct.
+ *
+ * Iterate over list of given type, continuing after current point,
+ * safe against removal of list entry.
+ */
+#define list_for_each_entry_safe_continue(pos, n, head, member)		\
+	for (pos = list_entry(pos->member.next, typeof(*pos), member),	\
+		n = list_entry(pos->member.next, typeof(*pos), member);	\
+	     &pos->member != (head);					\
+	     pos = n, n = list_entry(n->member.next, typeof(*n), member))
+
+/**
+ * list_for_each_entry_safe_from - iterate over list from current point safe against removal
+ * @pos:	the type * to use as a loop cursor.
+ * @n:		another type * to use as temporary storage
+ * @head:	the head for your list.
+ * @member:	the name of the list_struct within the struct.
+ *
+ * Iterate over list of given type from current point, safe against
+ * removal of list entry.
+ */
+#define list_for_each_entry_safe_from(pos, n, head, member)		\
+	for (n = list_entry(pos->member.next, typeof(*pos), member);	\
+	     &pos->member != (head);					\
+	     pos = n, n = list_entry(n->member.next, typeof(*n), member))
+
+/**
+ * list_for_each_entry_safe_reverse - iterate backwards over list safe against removal
+ * @pos:	the type * to use as a loop cursor.
+ * @n:		another type * to use as temporary storage
+ * @head:	the head for your list.
+ * @member:	the name of the list_struct within the struct.
+ *
+ * Iterate backwards over list of given type, safe against removal
+ * of list entry.
+ */
+#define list_for_each_entry_safe_reverse(pos, n, head, member)		\
+	for (pos = list_entry((head)->prev, typeof(*pos), member),	\
+		n = list_entry(pos->member.prev, typeof(*pos), member);	\
+	     &pos->member != (head);					\
+	     pos = n, n = list_entry(n->member.prev, typeof(*n), member))
+
+/**
+ * list_safe_reset_next - reset a stale list_for_each_entry_safe loop
+ * @pos:	the loop cursor used in the list_for_each_entry_safe loop
+ * @n:		temporary storage used in list_for_each_entry_safe
+ * @member:	the name of the list_struct within the struct.
+ *
+ * list_safe_reset_next is not safe to use in general if the list may be
+ * modified concurrently (eg. the lock is dropped in the loop body). An
+ * exception to this is if the cursor element (pos) is pinned in the list,
+ * and list_safe_reset_next is called after re-taking the lock and before
+ * completing the current iteration of the loop body.
+ */
+#define list_safe_reset_next(pos, n, member)				\
+	(n = list_entry(pos->member.next, typeof(*pos), member))
+
+#endif