[MXNet] MXNet Notes I: Custom DataIter

This section mainly focuses on writing a custom DataIter.

Posted by Geng Zhan on 2017-12-16

Custom Iter Tutorial: an introduction to custom iterators and the methods that must be overridden in a custom iter.

DataBatch: sheds light on the provide_data and provide_label attributes of the custom data iter.

A toy example of a DataIter is given below, with notes included as comments.

This is just a manual copy of the Custom Iter Tutorial with several comments added for better understanding, so there is no need to switch between the tutorial and DataBatch pages.

import mxnet as mx


class DemoIter(mx.io.DataIter):
    def __init__(self, data_names, data_shapes, data_gen,
                 label_names, label_shapes, label_gen,
                 num_batches=10):
        # fixed and must-have; wrapped in list() because zip() is a
        # one-shot iterator on Python 3 and next() reads these repeatedly
        self._provide_data = list(zip(data_names, data_shapes))
        self._provide_label = list(zip(label_names, label_shapes))
        self.num_batches = num_batches  # total number of batches per pass
        # list of callables for generating data: each takes a shape and
        # returns array-like values of that shape
        self.data_gen = data_gen
        # list of callables for generating labels, used the same way
        self.label_gen = label_gen
        self.cur_batch = 0  # index of the current batch among all batches

    # returns an iterator object;
    # called implicitly at the beginning of a for loop
    def __iter__(self):
        return self

    # Basic Python iterator implementation: a class needs __iter__() plus
    # next() for Python 2 or __next__() for Python 3.
    # Forwarding __next__() to next() keeps the code compatible with both.
    def __next__(self):
        return self.next()

    def reset(self):
        self.cur_batch = 0

    @property
    def provide_data(self):
        # refers to: self._provide_data = list(zip(data_names, data_shapes))
        return self._provide_data

    @property
    def provide_label(self):
        # refers to: self._provide_label = list(zip(label_names, label_shapes))
        return self._provide_label

    def next(self):
        if self.cur_batch < self.num_batches:
            self.cur_batch += 1
            # provide_data is a list of (name, shape) descriptions;
            # the i-th item describes the name and shape of data[i]
            data = [mx.nd.array(g(d[1])) for d, g in zip(self._provide_data, self.data_gen)]
            label = [mx.nd.array(g(d[1])) for d, g in zip(self._provide_label, self.label_gen)]
            return mx.io.DataBatch(data, label)
        else:
            raise StopIteration
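
The iteration protocol used by the toy iter (__iter__, next/__next__, reset) can be tried without MXNet installed. The following is only a minimal sketch under stated assumptions: CountingIter and run_epochs are hypothetical names made up for illustration and are not part of MXNet, and the batches are zero-filled Python lists rather than NDArrays. It shows why reset() has to be called before each new pass over the data.

```python
class CountingIter(object):
    """Hypothetical MXNet-free stand-in that follows the same protocol."""

    def __init__(self, names, shapes, num_batches=3):
        # list(...) matters on Python 3, where zip() can only be consumed once
        self.provide_data = list(zip(names, shapes))
        self.num_batches = num_batches
        self.cur_batch = 0

    def __iter__(self):
        return self

    def reset(self):
        self.cur_batch = 0

    def next(self):
        if self.cur_batch < self.num_batches:
            self.cur_batch += 1
            # placeholder batch: one zero-filled list per (name, shape) entry
            return [[0.0] * shape[0] for _, shape in self.provide_data]
        raise StopIteration

    __next__ = next  # Python 3 spelling of the same method


def run_epochs(it, num_epochs):
    """Count how many batches each pass yields, resetting before every pass."""
    counts = []
    for _ in range(num_epochs):
        it.reset()
        counts.append(sum(1 for _batch in it))
    return counts


demo = CountingIter(["data"], [(4,)], num_batches=3)
print(run_epochs(demo, 2))  # [3, 3]
```

Without the reset() call, the second pass would yield no batches, because cur_batch would still equal num_batches.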

TODO: check and compare the code of DataIter for VID in R-FCN

A little worried about gf’s IELTS, since the outcome of tonight's listening test was a total catastrophe…

Nighty night, will resume this work tomorrow.

Oh, also need to buy knives for the first meal. Remember this~